I'm building a link checker and want to make sure that a link to be checked is not inside a commented-out section of the HTML. Some of the pages have huge sections commented out, which seem to be too much text for an array element, so no array element is created for them. The code I'm using is:
$commenttext = @implode('', file($url));
$commenttext = str_replace("\n", "", $commenttext);
$commenttext = str_replace("\r", "", $commenttext);
$commenttext = str_replace("-->", "-->\n", $commenttext); // seems to need this: otherwise the greedy .* runs on to the last -->
$commentOutBits = array();
preg_match_all("/<!--(.*)-->/", $commenttext, $commentOutBits);
$commentedOutText_imploded = implode("", $commentOutBits[1]);
if (strpos($commentedOutText_imploded, $href) === false) { /* then check it */ }
This works for most of the comments, just not for the huge chunks. Each of these chunks is the equivalent of a table structure for one full screen of images and text (at a screen resolution of 1024 x 768). My client is very touchy, and I'd rather not have them see results for links that are commented out. (Yes, unfortunately they want to be able to run the checker themselves.)
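To make the goal concrete, here is a minimal sketch of the check I'm after, written with plain string scanning instead of a regex. The function name isInsideComment and the sample HTML are made up for illustration, it assumes PHP 7 or later for the type declarations, and I have not confirmed that it behaves any better on the huge blocks.

```php
<?php
// Hypothetical helper (not part of my current code): returns true if $href
// occurs inside any <!-- ... --> block of $html.
function isInsideComment(string $html, string $href): bool
{
    $offset = 0;
    // Walk through the document one comment opener at a time.
    while (($start = strpos($html, '<!--', $offset)) !== false) {
        $end = strpos($html, '-->', $start + 4);
        // Treat an unterminated comment as running to the end of the document.
        $body = ($end === false)
            ? substr($html, $start + 4)
            : substr($html, $start + 4, $end - $start - 4);
        if (strpos($body, $href) !== false) {
            return true;
        }
        if ($end === false) {
            break;
        }
        // Continue scanning after the closing "-->".
        $offset = $end + 3;
    }
    return false;
}

// Made-up sample page for illustration.
$html = '<a href="live.html">x</a><!-- <a href="dead.html">y</a> -->';
var_dump(isInsideComment($html, 'dead.html')); // bool(true)
var_dump(isInsideComment($html, 'live.html')); // bool(false)
```

With something like this, the last line of my code would become: if (!isInsideComment($commenttext, $href)) { /* then check it */ }, and the newline stripping would no longer be needed.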
Thanks,
Biggus