hey, i'm trying to screen-scrape this webpage: http://www.huds.harvard.edu/foodpro/center_frame.asp?naFlag=1&sName=HARVARD+UNIVERSITY+DINING+SERVICES&locationNum=05&locationName=Hot+Entrees%2C+Starches%2C+Bean%2FGrain+and+Vegetables%3C%2Ffont%3E%3C%2Fa%3E%3Cbr%3E%3Cfont%3E%26nbsp%3C%2Ffont%3E%3Ca%3E%3Cfont%3E and i want to store the text for the meals in a sql database.
so several questions:
1) i have split the page into an array of lines, but i'm pretty sure all the html is still in there: some elements of the array contain html, while others contain the text i want. while it's not 100% critical to strip the html out, i would prefer to, both to save space and to make life easier in the long run. any ideas?
2) i'm getting a Parse error: syntax error, unexpected T_STRING on line 27 which is
$insert = sprintf("INSERT INTO 'menu' ('%d') VALUES ('%s')", $i,
$contents[$i]);
my guess is that it has something to do with using %d for the row name, but i'm not sure. i set it up this way so that each element of the array goes in a different row. i doubt this is the most efficient set-up, but i just want the easiest approach that will still be easy to search afterwards.
here is my whole code. any help would be much appreciated!
<?php
//require common code (this connects me to database)
require_once("inc/common.inc");
//open HUDS website for reading
$url =
"http://www.huds.harvard.edu/foodpro/center_frame.asp?naFlag=1&sName=HARVARD+UNI$
$handle = fopen($url, "r");
//grab contents from webpage
$contents = file_get_contents($url);
echo "$contents";
fclose($handle);
//split into array
$contents = explode("\n", $contents);
for($i = 0; $i < strlen($contents); $i ++)
{
    $empty = "TRUNCATE TABLE `menu`";
    //execute query
    $query = mysql_query($empty);
    if (!query)
        apologize("Couldn't empty table);
    $insert = sprintf("INSERT INTO 'menu' ('%d') VALUES ('%s')", $i,
        $contents[$i]);
    if(!$insert)
        apologize("Could not insert %s into table", $contents[$i]);
    //execute query
    $result = mysql_query($insert);
    if(!result)
        apologize("Error! Check primary key.");
?>
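
On the parse error: the apologize("Couldn't empty table); line is missing its closing quote, so the string most likely runs on until the first " of the sprintf line, and PHP then trips over the bare word INSERT. That would explain the unexpected T_STRING. A few other things look off too: !query and !result lack the $, strlen($contents) is applied to an array where count($contents) is presumably meant, the TRUNCATE sits inside the loop so the table would be emptied on every pass, the for loop's closing brace is missing from the paste, and MySQL quotes identifiers with backticks rather than single quotes. Below is a minimal sketch of one way to strip the html and store one row per line. It is not the original code: it swaps the mysql_* functions for PDO (mysql_* was removed in PHP 7), and the function names clean_lines and store_lines as well as the columns line_no and item are made up for illustration, so adjust them to the real table.

```php
<?php
// Sketch only. Assumed (hypothetical) schema: table `menu` with an integer
// column `line_no` and a text column `item`.

/**
 * Strip html from scraped page source and return the non-empty text lines.
 */
function clean_lines(string $html): array
{
    // Strip tags from the whole document first so tags spanning lines are handled,
    // then turn entities such as &amp; back into plain characters.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // &nbsp; decodes to a non-breaking space, which trim() does not remove.
    $text = str_replace("\u{00A0}", ' ', $text);

    $lines = [];
    foreach (explode("\n", $text) as $line) {
        $line = trim($line);
        if ($line !== '') {
            $lines[] = $line;
        }
    }
    return $lines;
}

/**
 * Replace the contents of `menu` with one row per cleaned line.
 */
function store_lines(PDO $pdo, array $lines): void
{
    // Empty the table once, before the loop, not on every iteration.
    $pdo->exec('TRUNCATE TABLE `menu`');
    // Fixed column names in backticks; the values travel as bound parameters,
    // so quotes inside the scraped text cannot break the query.
    $stmt = $pdo->prepare('INSERT INTO `menu` (`line_no`, `item`) VALUES (?, ?)');
    foreach ($lines as $i => $text) {
        $stmt->execute([$i, $text]);
    }
}

// Usage (DSN and credentials are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
// store_lines($pdo, clean_lines(file_get_contents($url)));
```

Keeping a fixed pair of columns (line number, text) means each array element still lands in its own row, and the text column can be searched with an ordinary WHERE ... LIKE query. Note that strip_tags() only removes the tags themselves, so any text inside script or style elements would survive; if the page has those, an html parser such as DOMDocument may be a better fit.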