Hey. I am trying to write a regular expression that takes the code of a site (I already have the code as a string, using the curl library 🙂), finds the HTML image tags, and copies what's in the quotation marks, e.g. src="url.com/something/something/page.html". It should then replace what's in the quotation marks with a couple of variables spliced together.
If you have any questions please post them.
Here's an example.
This is the code of the site I'm scraping from. The domain name is http://asdf.com/ (the user types this into a form):
<html>
<body>
<img src="89asdf.gif">
</body>
</html>
The problem right now is that if I display this code as is, the picture won't show, because the image isn't on my host and the relative path ends up pointing at my server instead of asdf.com.
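To show the first half of what I'm after (finding the image tags and pulling out what's in the quotation marks), here is a rough sketch. The $html string is just a hypothetical stand-in for what curl returns, and the pattern is my own guess at a reasonable one, not a complete HTML parser:

```php
<?php
// Hypothetical sample page, matching the example above.
$html = "<html>\n<body>\n<img src=\"89asdf.gif\">\n</body>\n</html>";

// Capture whatever sits between the quotes of the src attribute of each <img> tag.
// (["\']) remembers which quote opened the value, and \1 requires the same one to close it.
// The i modifier makes the match case-insensitive, so <IMG SRC="..."> is found too.
preg_match_all('/<img\b[^>]*\bsrc\s*=\s*(["\'])(.*?)\1/i', $html, $matches);

// $matches[2] holds one entry per image: the text between the quotes.
$sources = $matches[2];
print_r($sources);
```

This only handles quoted src values; an unquoted attribute would be skipped.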
This is the code so far:
<html>
<head>
<title>goto</title>
</head>
<body>
<?php
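// Read the submitted URL, if any (avoids an undefined-index notice on first load).
$urlb = isset($_POST['url']) ? $_POST['url'] : '';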
if (isset($_POST['submit']))
{
//
// The PHP curl module supports the received page to be returned in a variable
// if told.
//
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$urlb);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
$result = curl_exec ($ch);
curl_close ($ch);
echo $result;
echo "0258";
}
else
{
echo '<form method="post" action="goto.php">';
echo 'Type in the exact url address';
echo '<input name="url" type="text" value="http://www.">';
echo '<input name="submit" type="submit" value="GO">';
echo '</form>';
}
?>
</body>
</html>
(Just to note: I set my code up so that it receives the POST from a form on a different page, but if you go straight to this page, it shows the form itself.)
So the problem is that I can't display the pictures.
My idea was to take the src from the other site and count the slashes in it. No slashes means the image is in the same directory as the page, so I have to add the domain and the subdirectories of the page. One or more slashes without the domain name means the image is in a subdirectory below the page, so I add the same things as in the no-slash case. If the domain name and at least one slash are there, the entire URL is already present and nothing needs to be added or changed. Then I would add whatever is needed to the src to make a complete URL...
Again, if there are any questions, please post them and I'll do my best 😃
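Here is a minimal sketch of that idea, to make it concrete. The function names make_absolute and rewrite_images are made up for illustration. Instead of counting slashes it checks whether the src already starts with http:// or https://, which I think comes to the same thing. It also treats a src starting with "/" as relative to the domain root, which is a case I didn't describe above. It does not handle "../" or a port number in the page URL.

```php
<?php
// Hypothetical helper: make an <img> src absolute, given the URL of the page it came from.
function make_absolute($src, $pageUrl)
{
    // The entire URL is already there: nothing to add or change.
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }

    $parts = parse_url($pageUrl);
    $root  = $parts['scheme'] . '://' . $parts['host'];

    // Leading slash: relative to the domain root (an assumption, not in the description above).
    if (substr($src, 0, 1) === '/') {
        return $root . $src;
    }

    // Otherwise the src is relative to the directory of the page,
    // whether it contains further slashes or not.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $src;
}

// Hypothetical helper: rewrite the quoted src of every <img> tag in $html.
function rewrite_images($html, $pageUrl)
{
    return preg_replace_callback(
        '/(<img\b[^>]*\bsrc\s*=\s*)(["\'])(.*?)\2/i',
        function ($m) use ($pageUrl) {
            // $m[1] is the tag up to the quote, $m[2] the quote, $m[3] the old src.
            return $m[1] . $m[2] . make_absolute($m[3], $pageUrl) . $m[2];
        },
        $html
    );
}

echo rewrite_images('<img src="89asdf.gif">', 'http://asdf.com/');
```

In the page above, this would mean calling rewrite_images($result, $urlb) before echoing $result.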
From here down has been solved.
OK, I've got another problem, and I need this to be complete very soon, so I need help fast.
The problem is that I'm again using the curl library to get code from another site. What my code does is:
1. It takes the code from the website.
2. It runs a regular expression that matches <html>, anything in between, and </html>.
3. It takes that code and cuts it down further until it is raw information (I have not done this step because I cannot get the previous one to work).
4. It displays the information in a special format.
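Since I haven't written steps 3 and 4 yet, here is only a rough sketch of what I have in mind for them. The $body string is a hypothetical stand-in for whatever step 2 captures, raw_lines is a made-up name, and "raw information" is assumed here to mean the visible text with the tags removed:

```php
<?php
// Hypothetical stand-in for whatever step 2 captured between <html> and </html>.
$body = "\n<body>\n<h1>Title</h1>\n<p>First   line</p>\n<p>Second line</p>\n</body>\n";

// Step 3: strip the tags and keep only the non-empty lines of text.
function raw_lines($markup)
{
    $lines = array();
    foreach (preg_split('/\r\n|\r|\n/', strip_tags($markup)) as $line) {
        // Collapse runs of whitespace and trim the ends.
        $line = trim(preg_replace('/\s+/', ' ', $line));
        if ($line !== '') {
            $lines[] = $line;
        }
    }
    return $lines;
}

// Step 4: print each line in the white-on-black style of the page.
foreach (raw_lines($body) as $line) {
    echo '<font color="white">' . htmlspecialchars($line) . '</font><BR>' . "\n";
}
```

Note that strip_tags keeps the text inside script and style elements, so a real page would probably need more filtering than this.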
The code is:
<html>
<head>
<title></title>
</head>
<body bgcolor="BLACK">
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"the test page");
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
$result=curl_exec ($ch);
curl_close ($ch);
echo "<font color = \"white\">complete</font><BR>";
// The s modifier lets the dot match line breaks; i ignores case; .*? stops at the first </html>.
if(preg_match ('|<html[^>]*>(.*?)</html>|is', $result, $returnb))
{
echo "<font color = \"white\">complete</font><BR>";
}
else
{
echo "<font color = \"white\">error</font><BR>";
}
echo "<font color = \"white\">complete</font><BR>";
echo "<font color = \"white\">$returnb[1]</font><BR>";
?>
</body>
</html>
The problem is that it won't take information from multiple lines. It can get parts of one line, but I need multiple lines of code to be matched (hence I need the match to include line breaks).
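For anyone hitting the same thing, here is a small sketch of the difference as I understand it. By default the dot in a PCRE pattern does not match a newline; the s modifier changes that. The $page string is a made-up test page:

```php
<?php
// Hypothetical multi-line page.
$page = "<html>\n<body>\nline one\nline two\n</body>\n</html>";

// Without s, the dot stops at each newline, so a multi-line page does not match.
$without = preg_match('|<html>(.*?)</html>|i', $page, $a);

// With s, the dot also matches newlines, so everything between the tags is captured.
$with = preg_match('|<html>(.*?)</html>|is', $page, $b);

var_dump($without); // int(0)
var_dump($with);    // int(1)
echo $b[1];         // the lines between <html> and </html>
```

Using [\s\S]*? in place of .*? should give the same result without the s modifier.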
thanks in advance