OK, I've figured it out: I didn't need to use print. So far the script displays all the links and the first URL's meta tags. But how can I pick a link out of the array so that the script can then search within that link, and so on? It needs to be an ongoing spider.
Also, I'm stuck at the last part of my code, with shuffle :S Could someone tell me why I get this error: Warning: shuffle() expects parameter 1 to be array, string given in C:\Inetpub\vhosts*httpdocs\get.php on line 51
<?php
$username="*";
$password="*";
$database="*_";
mysql_connect('localhost',$username,$password);
@mysql_select_db($database) or die( "Unable to select database");
$input = file_get_contents("http://www.ebay.co.uk");
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
if(preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
foreach($matches as $match) {
?>
<br><br>
<?php
echo $match[2];
#echo $match[3];
}
#New un
?><br><br>
<?php
$input2 = file_get_contents("http://www.ebay.co.uk");
$regexp2 = "<META NAME=[^>]*CONTENT=(\"??)([^\">]*?)\\1[^>]*>";
if(preg_match_all("/$regexp2/siU", $input2, $matches2, PREG_SET_ORDER)) {
foreach($matches2 as $matchh) {
echo $matchh[2];
#echo $match[3];
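# Escape both values so a quote in a URL or meta tag cannot break the query
$query = "INSERT INTO nero VALUES ('Ebay','" . mysql_real_escape_string($match[2]) . "','" . mysql_real_escape_string($matchh[2]) . "')";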
mysql_query($query);
}
}
}
$i = 0;
while($i < 100)
{
?><br><br>
<?php
$input = file_get_contents("$match[2]");
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
if(preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
foreach($matches as $match) {
echo $match[2];
#echo $match[3];
}
}
shuffle($match); // the warning is raised by this call
//get the first index
$match = $match[0];
$i = $i + 1;
}
?>
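For reference, here is a minimal sketch of the kind of loop I think is needed, under some assumptions. The warning appears to come from `$match = $match[0];`: after the first pass `$match` is a string, and if the next fetch finds no links the `foreach` never resets it, so `shuffle()` receives a string. The sketch keeps the URLs in their own array and picks one with `array_rand()` instead. The names `extractLinks`, `crawl` and `$fetch` are hypothetical, and skipping relative and already visited links is my own assumption, not something the script above does.

```php
<?php
// Hypothetical helper: return every href value found in $html,
// using the same pattern as the script above.
function extractLinks($html) {
    $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
    $links = array();
    if (preg_match_all("/$regexp/siU", $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $links[] = $match[2]; // group 2 holds the URL
        }
    }
    return $links;
}

// Hypothetical crawl loop. $fetch is any callable that takes a URL and
// returns the page body, or false on failure (e.g. 'file_get_contents').
function crawl($startUrl, $maxPages, $fetch) {
    $visited = array();
    $url = $startUrl;
    for ($i = 0; $i < $maxPages && $url !== null; $i++) {
        $visited[] = $url;
        $html = call_user_func($fetch, $url);
        $links = ($html === false) ? array() : extractLinks($html);

        // Assumption: follow only absolute http(s) links not seen before.
        $candidates = array();
        foreach ($links as $link) {
            if (preg_match('#^https?://#i', $link) && !in_array($link, $visited)) {
                $candidates[] = $link;
            }
        }

        // array_rand() needs a non-empty array, so stop when nothing is left.
        $url = $candidates ? $candidates[array_rand($candidates)] : null;
    }
    return $visited;
}
```

Something like `crawl('http://www.ebay.co.uk', 100, 'file_get_contents')` would then walk up to 100 pages and return the list of URLs it visited. The meta tag extraction and the database insert could go inside the loop, right after the fetch.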
Thanks.