Hi, I've been fiddling with this for a bit now and I'm at a loss as to why it won't work. Anyway, here is the code:
include ("simple_html_dom.php");
$search_url = 'http://www.diy.com/diy/jsp/bq/nav.jsp?fh_view_size=10&fh_eds=%3f&fh_reffacet=styleStyle&fh_location=%2f%2fcatalog01%2fen_GB%2fcategories%3C{9372014}%2fcategories%3C{9372035}%2fcategories%3C{9372170}%2fspecificationsProductType%3dbathroom_fittings%2fstyleStyle%3E{axis}&fh_refview=summary&fh_refpath=facet_159017215&ts=1280346500445';
$html_search = file_get_html($search_url);
echo "<html><body>";
echo "<table border=1>";
$url_append = "http://www.diy.com";
$i = 1;
foreach ($html_search->find('a[class=productDetailLink]') as $link) {
$link_url[$i] = $url_append.$link->href;
$i++;
}
foreach ($link_url as $product_url) {
$html_product = file_get_html($product_url);
// Array keys are quoted: bare words like desc are treated as undefined constants.
$return['desc'] = $html_product->find('div[class=productInfo] h1', 0)->plaintext;
$return['ean'] = substr($html_product->find('div[class=productInfo] p[class=ean]', 0)->plaintext,-13,13);
$return['price'] = substr($html_product->find('div[class=productInfo] p[class=productPrice] span[class=onlyPrice]', 0)->plaintext,12);
$return['image'] = "<img src=\"".$html_product->find('div[class=productHero] div[class=noscript] img', 0)->src."\">";
echo "<tr>";
/*foreach ($return as $k) {
echo "<td>".$k."</td>";
}
echo "</tr>";*/
}
echo "</table>";
echo "</body></html>";
I'm trying to extract some basic product information from the B&Q website by extracting the URLs from a search page and then opening each one and extracting the product information. In the code above the search URL has been hardcoded for testing purposes.
The problem I'm having is that when I run this code it seems to output the details for only two products, repeated as many times as there are URLs, and in no specific order, i.e. if I run it twice they might come out in a different order. Slightly random, to say the least. I've split up the foreach parts to make sure the URLs being passed on are correct, and running print_r against $link_url shows 7 URLs, all different and all working if you copy them into a browser window.
If I manually set up the array with the correct links (i.e. $link_url[x] = "url") it works fine, yet outputting $product_url yields the same result regardless of which way I set up the $link_url array. As far as I can see, the URLs being passed to file_get_html() for $html_product are the same, yet it only works when they are manually created. Is there something incredibly obvious I'm missing here?
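One way to check whether the scraped and hand-typed URLs really are identical is to compare them byte for byte rather than by eye. print_r output viewed in a browser can hide differences such as HTML entities (a raw "&amp;" is displayed as "&"). Below is a minimal sketch of that check. The helper name describe_url and the sample URLs are hypothetical, and whether entity-encoded hrefs are actually the cause here is an unconfirmed assumption.

```php
<?php
// Hypothetical helper: reports how a scraped href differs from a hand-typed URL.
function describe_url(string $scraped, string $manual): array
{
    // Decode entities such as &amp; that may survive in a raw href attribute.
    $decoded = html_entity_decode($scraped, ENT_QUOTES, 'UTF-8');

    return [
        'identical_raw'     => $scraped === $manual,
        'identical_decoded' => $decoded === $manual,
        'decoded'           => $decoded,
    ];
}

// Made-up sample values, not taken from the real site.
$scraped = 'http://www.example.com/nav.jsp?a=1&amp;b=2';
$manual  = 'http://www.example.com/nav.jsp?a=1&b=2';

$report = describe_url($scraped, $manual);

// var_dump prints exact booleans; htmlspecialchars stops the browser from
// silently rendering "&amp;" as "&" when the output is viewed as HTML.
var_dump($report['identical_raw'], $report['identical_decoded']);
echo htmlspecialchars($scraped), "\n";
```

If identical_decoded comes back true while identical_raw is false, running html_entity_decode on $link->href before building $link_url would be the first thing to try.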