Hi, this is my first post and I need some help. I've searched Google, php.net, and many tutorials, and I can't figure this out. I originally got help with this code from someone I can no longer get hold of, so I'm stuck.

Anyway, the code below worked perfectly about 3-4 weeks ago, and now for some reason it's not working. It takes price values from the website listed in the code and puts them into my website.

I have not changed anything in my code, so I am guessing they may have changed something on their end. I would like to know what may be wrong with this code, and whether I am searching for the right fields or variables.

Please let me know if you have any ideas. Thanks.

<?php
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, "http://www.bloomberg.com/markets/commodities/energyprices.html");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);

$contents = curl_exec($ch);
curl_close($ch);

function find_values($string, $page)
{
    $string = preg_quote($string, '#');

    // Take everything from the given string to the end of the row
    preg_match("#$string(.*)</tr>#Us", $page, $match);

    // Get the values from the row we found previously
    preg_match_all("#<span[^>]*>([^<]*)</span>#s", $match[1], $values);

    // Return the values
    return $values[1];
}

$find1 = find_values('nymex crude future', $contents);
echo "Nymex Crude Future: Price = $find1[0], Change = $find1[1], % Change = $find1[2], Time = $find1[3]<br>";

$find2 = find_values('Dated Brent Spot', $contents);
echo "Nymex Heating Oil Future: Price = $find2[0], Change = $find2[1], % Change = $find2[2], Time = $find2[3]<br>";

$find3 = find_values('WTI Cushing Spot', $contents);
echo "Nymex RBOB Gasoline Future: Price = $find3[0], Change = $find3[1], % Change = $find3[2], Time = $find3[3]<br>";
?>

    I have not changed anything with my code so I am guessing they may have changed something on their end.

    Start looking at their clientside source then.

      laserlight wrote:

      Start looking at their clientside source then.

      I have looked at their site's source code and it seems to match what I am searching for. Here is the relevant part of their source. I think I'm going crazy or blind, because I can't find the mismatch:

      <td><span class="tbl_txt">Nymex Crude Future</span></td><td align="right"><span class="tbl_num">91.13</span></td><td align="right"><span class="tbl_txt_green">1.05</span></td><td align="right"><span class="tbl_txt_green">1.17</span></td><td align="right"><span class="tbl_num">11:08</span></td>
      


        Well the first regex grabs data from the string up until a '</tr>', and since the latter of the two doesn't appear in your HTML snippet, I'd say their site source code does not seem to be fine.
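To make that concrete, here is a small sketch that runs the same pattern shape as find_values() against a shortened copy of the row quoted above. The variable names are mine, and the fragment is only what was pasted in this thread, not the live page. It also shows a second thing worth checking: the pattern has no `i` modifier, so the lowercase search string 'nymex crude future' would not match the capitalised label even when the row is closed.

```php
<?php
// Shortened copy of the row quoted in the thread (note: no closing </tr>).
$open   = '<td><span class="tbl_txt">Nymex Crude Future</span></td>'
        . '<td align="right"><span class="tbl_num">91.13</span></td>';
$closed = $open . '</tr>';

$label = preg_quote('Nymex Crude Future', '#');
$lower = preg_quote('nymex crude future', '#');

// Same pattern as find_values(): label, then everything up to </tr>.
$withoutClose = preg_match("#$label(.*)</tr>#Us", $open);
$withClose    = preg_match("#$label(.*)</tr>#Us", $closed);

// Lowercase search string, without and with the case-insensitive modifier.
$caseSensitive   = preg_match("#$lower(.*)</tr>#Us", $closed);
$caseInsensitive = preg_match("#$lower(.*)</tr>#Usi", $closed);

var_dump($withoutClose);    // int(0): no </tr>, so nothing matches
var_dump($withClose);       // int(1): matches once the row is closed
var_dump($caseSensitive);   // int(0): "nymex" does not match "Nymex"
var_dump($caseInsensitive); // int(1): matches with the i modifier
```

Running the same kind of check against the HTML your script actually receives (dump $contents to a file first) should tell you which of the two patterns stops matching.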

          Of course, as soon as the site's publishers change the layout of the site you're going to be starting all over again.

          Oh, and I think I should point out that you're in violation of Bloomberg's Terms of Service.


              Bradgrafelamn, do you have any suggestions on which part of my code is incorrect? Could you give me an example of what you see that's wrong?

              Thanks.

                Well basically what I see is wrong is what Weedpacket mentioned; you're attempting to scrape data from HTML source that can change and break your code without warning. If you really want to include the data from a remote site, you should look into contacting that site and see if they will provide you with some sort of XML feed.

                Other than that, your regexp pattern is looking for HTML that isn't there. Read up on regexps (I recommend Regular-Expressions.info) and adjust the pattern to match the current HTML code of the remote site.
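If adjusting the regexp gets tedious, one alternative (a different technique from the regex approach in the original post, not a fix to it) is to let PHP's DOM extension parse the HTML and then read the table cells. The sketch below is only that: find_row_values() is a hypothetical helper name, it assumes the DOM extension is installed, and it assumes each price row sits inside a <tr> with the label in the first <td>, which I have not verified against the live page. It is still scraping, so it will break just the same when the layout changes.

```php
<?php
// Hypothetical helper: returns the text of the cells that follow the
// label cell in the first table row whose first cell equals $label.
function find_row_values(string $label, string $html): array
{
    if ($html === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // since real-world HTML is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//tr') as $row) {
        $cells = [];
        foreach ($xpath->query('./td', $row) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        // Case-insensitive comparison on the first cell of the row.
        if ($cells && strcasecmp($cells[0], $label) === 0) {
            return array_slice($cells, 1);
        }
    }
    return [];
}

// Sample row taken from the snippet posted earlier, wrapped in <table><tr>.
$sample = '<table><tr>'
    . '<td><span class="tbl_txt">Nymex Crude Future</span></td>'
    . '<td align="right"><span class="tbl_num">91.13</span></td>'
    . '<td align="right"><span class="tbl_txt_green">1.05</span></td>'
    . '<td align="right"><span class="tbl_txt_green">1.17</span></td>'
    . '<td align="right"><span class="tbl_num">11:08</span></td>'
    . '</tr></table>';

$values = find_row_values('nymex crude future', $sample);
print_r($values); // price, change, % change, time
```

An empty array back means the label was not found, which is easier to detect than the undefined-index notices the regex version produces when the first preg_match() fails.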

                  Looks like I needed to add a few lines of code.

                  curl_setopt($ch, CURLOPT_FILETIME, true); // My habit
                  curl_setopt($ch, CURLOPT_REFERER, "");  // I belong here
                  curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)");  // I'm not scraping, I am a browser
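For anyone who lands here with the same symptom, it may help to make the fetch fail loudly rather than hand an empty or error page to the regex. Below is a rough sketch of that idea. fetch_page() is a hypothetical wrapper name, the timeout and redirect settings are my own guesses at sensible defaults, and the user agent string is simply the one from the lines above. Nothing here is specific to Bloomberg.

```php
<?php
// Hypothetical wrapper: fetch a URL and throw if the transfer fails
// or the server answers with anything other than HTTP 200.
function fetch_page(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_HEADER         => false,  // body only, no response headers
        CURLOPT_FOLLOWLOCATION => true,   // assumption: follow redirects
        CURLOPT_TIMEOUT        => 15,     // assumption: give up after 15 seconds
        CURLOPT_USERAGENT      => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)',
    ]);

    $body   = curl_exec($ch);
    $error  = curl_error($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false) {
        throw new RuntimeException("cURL request failed: $error");
    }
    if ($status !== 200) {
        throw new RuntimeException("Unexpected HTTP status $status");
    }
    return $body;
}
```

You would then call it as `$contents = fetch_page($url);` inside a try/catch, so a blocked request or an error page shows up as an exception instead of as missing values further down.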
                  

                    Once you have your agreement for a machine-readable feed in place with bloomberg.com, they will provide a feed in a documented, convenient format that you can develop code with and will remain stable.

                    Until then you're at the mercy of their team:
                    1. Blocking your bot for service abuse
                    2. Sending junk data to your bot
                    3. Sending the lawyers around to sue your ass.

                    Mark
