I'm trying to search a web page for a text link with specific link text and pull out that link's URL. In other words, I fetch a page such as http://www.victorious.org/links/ and want to find the linked text "Federal Bureau of Investigation" (which appears on that page) and return the URL for that specific link. It also needs to ignore linked images, etc.
I could use any help you could provide. Thanks!
Below is my starter script. So far it only lists every link on the page.
<?php
$url = "http://www.victorious.org/links/";
$input = @file_get_contents($url) or die("Could not access file: $url");
// Match <a ... href="...">...</a>; the U flag makes quantifiers lazy by default.
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
if (preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        // $match[2] = link address
        // $match[3] = link text
        echo "<a href=\"$match[2]\">$match[3]</a><br>\n";
    }
}
?>
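
One possible direction for the part I'm stuck on, sketched with PHP's DOM extension instead of a regex: parse the HTML, walk the anchors, skip any that wrap an image, and compare each anchor's text to the text I'm after. The function name find_link_url, the sample markup, and the example.com URLs are made up for illustration and are not taken from the real page. It also assumes an exact, case-insensitive match on the link text is what's wanted.

```php
<?php
// Hypothetical helper (not part of the starter script): returns the href of
// the first text link whose visible text equals $linkText, or null if none.
function find_link_url(string $html, string $linkText): ?string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect libxml warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Anchors that have an href and contain no <img> (skips linked images).
    foreach ($xpath->query('//a[@href][not(.//img)]') as $a) {
        // Collapse whitespace so line breaks inside the link text don't matter.
        $text = trim(preg_replace('/\s+/', ' ', $a->textContent));
        if (strcasecmp($text, $linkText) === 0) {
            return $a->getAttribute('href');
        }
    }
    return null;
}

// Made-up sample markup so the sketch runs without network access.
$sample = '<p><a href="http://example.com/logo"><img src="fbi.gif" alt="Federal Bureau of Investigation"></a>
<a href="http://example.com/fbi">Federal Bureau of
Investigation</a> <a href="http://example.com/other">Some other link</a></p>';

$found = find_link_url($sample, 'Federal Bureau of Investigation');
echo $found === null ? "Link not found\n" : $found . "\n";
```

Against the live page, the idea would be to pass $input from file_get_contents() in place of $sample. The href comes back exactly as written in the page, so a relative link would still need to be resolved against $url.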