Hey all.
I'm trying to use cURL to automate a task currently done manually.
Manually, this is what is done: go to the website and click a link that brings up a "login" page. That page only asks for an ID, no password. Enter the ID and click the "ok" button. Once that is done, a message pops up saying the ID has been validated.
Then you click another link that displays a page with 7 fields. You fill in three of them and click submit, and the remaining 4 fields are populated with values. Those values then need to be entered into our system.
The page that displays the final link is "main menu". I can call that page fine, but when I try to call the program that is the target of that link, I get a message that reads "Redirecting you to here". When I follow that link in the browser, it displays "FW-1 access denied (401)".
If I put the value of the redirect string into my PHP script, I get the following message:
FW-1 at piglet: FW-1 form has expired.
I googled that message and found the following:
"...
A:
This seems to be a bug that occurs with 4.0SP1 and 4.0SP2 installations on all platforms. I have been able to reproduce this with Partially or Fully Automatic Client Auth. Consider the following rulebase:
Source         Destination    Service    Action
AllUsers@Any   Internal-Net   Any        Client-Auth
AllUsers@Any   Internal-Net   HTTP       User Auth
If Client Auth could potentially apply for HTTP and Partially Automatic Client Auth is used, the Client Auth rule will cause this error message after the client auth rule "expires". It may happen in other cases as well.
The workaround is to make sure no Client Auth rule applies for HTTP using any of the automatic sign-on facilities. 4.0SP3 is supposed to fix this problem.
..."
Is there a way around this? What am I doing wrong? I ordered a book on screen scraping, webbots, etc., but I have a feeling that once it arrives, it won't cover this type of scenario.
I've copied several different functions from different places on the web and have experimented with them. Here is the script I'm testing with:
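To narrow down where the chain breaks, it might help to look at the status code and Location header of each hop instead of only the final page. Below is a rough sketch: parseHops() is a made-up helper name, and it assumes the response was fetched with CURLOPT_HEADER enabled so the header blocks are available as text.

```php
<?php
// Hypothetical helper: given the raw header text of a response (one block
// per hop when redirects are followed), return each hop's status code and
// Location header.
function parseHops($rawHeaders)
{
    $hops = array();
    // Header blocks are separated by a blank line.
    foreach (preg_split("/\r?\n\r?\n/", trim($rawHeaders)) as $block) {
        // Skip anything that does not start with an HTTP status line.
        if (!preg_match('#^HTTP/\S+\s+(\d{3})#', $block, $m)) {
            continue;
        }
        $location = null;
        if (preg_match('/^Location:\s*(.+?)\s*$/mi', $block, $loc)) {
            $location = $loc[1];
        }
        $hops[] = array('status' => (int)$m[1], 'location' => $location);
    }
    return $hops;
}
```

The header text could be taken from a response with something like substr($response, 0, curl_getinfo($curl, CURLINFO_HEADER_SIZE)). The result would show each redirect target and whether the last hop really is the 401 from the firewall.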
"....
<?php
// Fetch the splash page and return its HTML, or stop with the cURL error.
function testPage()
{
    $userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';
    $target_url = "http://apps.acehardware-vendors.com/dcas/splash.asp";
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_URL, $target_url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    // Run the request once and keep the response.
    $result = curl_exec($ch);
    if ($result === false) {
        echo "<br />cURL error number:" . curl_errno($ch);
        echo "<br />cURL error:" . curl_error($ch);
        exit;
    }
    curl_close($ch);
    return $result;
}
// GET $url and return the response (headers included when $header is true).
function getPage($url, $referer, $timeout, $header)
{
    $userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';
    // Callers may pass "" to mean "use the default", so test with empty().
    if (empty($timeout)) {
        $timeout = 30;
    }
    $curl = curl_init();
    // Only send a Referer when a real URL was given.
    if (strstr($referer, "://")) {
        curl_setopt($curl, CURLOPT_REFERER, $referer);
    }
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_TIMEOUT, (int)$timeout);
    curl_setopt($curl, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($curl, CURLOPT_HEADER, (int)$header);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
    $html = curl_exec($curl);
    curl_close($curl);
    return $html;
}
// POST $pvars to $url as a form submission and return the response body.
function postPage($url, $pvars, $referer, $timeout)
{
    $userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';
    // Callers may pass "" to mean "use the default", so test with empty().
    if (empty($timeout)) {
        $timeout = 30;
    }
    $curl = curl_init();
    $post = http_build_query($pvars);
    // Only send a Referer when a real URL was given (same check as getPage).
    if (strstr($referer, "://")) {
        curl_setopt($curl, CURLOPT_REFERER, $referer);
    }
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_TIMEOUT, (int)$timeout);
    curl_setopt($curl, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($curl, CURLOPT_HEADER, 0);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
    // Follow redirects on this handle.
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl, CURLOPT_POST, 1);
    curl_setopt($curl, CURLOPT_POSTFIELDS, $post);
    curl_setopt($curl, CURLOPT_HTTPHEADER,
        array("Content-type: application/x-www-form-urlencoded"));
    // Execute once so the form is only submitted once.
    $html = curl_exec($curl);
    if (curl_errno($curl)) {
        print "Check ERROR: " . curl_error($curl);
    }
    curl_close($curl);
    return $html;
}
$data = array('txtVendorNumber' => '00000');
// Fetch the main menu page (20 second timeout, no response headers).
$html = getPage("http://www.acehardware-vendors.com/mainmenu.asp", "", 20, false);
print $html;
// Post the ID to the firewall redirect URL.
$html = postPage("http://204.146.172.129/fwauthredirect204.146.172.129id0000249726", $data, "", 30);
print $html;
/*
$html = testPage();
print $html;
*/
?>
...."
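The manual steps (main menu, ID page, then the 7-field form) all happen in one browser session, whereas each function above starts a fresh cURL handle with no cookies. A possible variation is sketched below, assuming the firewall remembers the validated ID through cookies, which is unconfirmed. sessionOptions(), sessionRequest(), runFlow() and the $loginUrl / $formUrl parameters are placeholders, not real names from the site.

```php
<?php
// Hypothetical helper: options shared by every request in one session.
// Setting CURLOPT_COOKIEFILE turns on cURL's cookie handling, and
// CURLOPT_COOKIEJAR saves the cookies when the handle is closed.
function sessionOptions($cookieFile, $timeout = 30)
{
    return array(
        CURLOPT_COOKIEFILE     => $cookieFile,
        CURLOPT_COOKIEJAR      => $cookieFile,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => $timeout,
    );
}

// Hypothetical helper: send a GET, or a POST when $postFields is given,
// on an existing handle so cookies from earlier requests are reused.
function sessionRequest($ch, $url, $postFields = null)
{
    curl_setopt($ch, CURLOPT_URL, $url);
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    } else {
        curl_setopt($ch, CURLOPT_HTTPGET, true);
    }
    return curl_exec($ch);
}

// Hypothetical flow: menu page, then the ID page, then the 7-field form.
// $loginUrl and $formUrl are assumptions; the real URLs are not known here.
function runFlow($menuUrl, $loginUrl, $formUrl, $vendorId, $formFields)
{
    $cookieFile = tempnam(sys_get_temp_dir(), 'fw1');
    $ch = curl_init();
    curl_setopt_array($ch, sessionOptions($cookieFile));
    sessionRequest($ch, $menuUrl);
    sessionRequest($ch, $loginUrl, array('txtVendorNumber' => $vendorId));
    $result = sessionRequest($ch, $formUrl, $formFields);
    curl_close($ch);
    return $result;
}
```

If the firewall ties authentication to the client IP or to a short-lived form token instead of cookies, this would not be enough on its own, and the token would have to be read from the "Redirecting you to here" page on every run.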
Any help or suggestions would be greatly appreciated.