Hi, I need to scrape information from a website that requires an ASPX login (the first page is a form that asks for a user name and password). I am wondering which method I should use. Can I use cURL to fill in the form, send the login information to the server, and then access the page? Or should I log in first to generate a cookie and then use that cookie to access the page (behind the login) that I need to scrape? The site I am trying to scrape is www.knowledge.reuters.com. Any help would be greatly appreciated! Thank you!!

    Well, in this case you will need to use cURL, so make sure the server you run the script on has the curl extension enabled. If it doesn't, a simple request to your host could have it enabled. You may also have to set the user agent to MSIE (something like "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"). You can do this with curl_setopt() and the CURLOPT_USERAGENT option. PHP's header() function won't help here, because it only sets headers on the response your own script sends, not on the request cURL makes.
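
    To answer the "which method" part: the two options are really the same flow. You post the login form with cURL, cURL stores the session cookie, and you reuse that cookie for the pages behind the login. Here is a rough sketch of how that might look. I have not tested it against the actual site, so the URLs and the form field names (txtUser, txtPassword) are placeholders you would need to replace after looking at the real login page. The helper names are made up for this example. ASPX forms usually include hidden fields such as __VIEWSTATE that have to be sent back, which is why the sketch reads them from the form first.

```php
<?php
// Hypothetical helper: collect the hidden <input> fields from the login page.
// ASP.NET forms usually carry __VIEWSTATE (and often __EVENTVALIDATION) there.
function extractHiddenFields(string $html): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from imperfect real-world markup
    $xpath = new DOMXPath($doc);
    $fields = [];
    foreach ($xpath->query('//input[@type="hidden"]') as $input) {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }
    return $fields;
}

// Hypothetical helper: one cURL handle that keeps cookies between requests.
function newSession(string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow the redirect after login
        CURLOPT_USERAGENT      => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)',
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies here when the handle closes
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies and enable the cookie engine
    ]);
    return $ch;
}

// Hypothetical helper: GET when $post is null, otherwise POST the given fields.
function fetch($ch, string $url, ?array $post = null): string
{
    curl_setopt($ch, CURLOPT_URL, $url);
    if ($post !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post));
    } else {
        curl_setopt($ch, CURLOPT_HTTPGET, true);
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException(curl_error($ch));
    }
    return $body;
}

// Assumed flow: load the form, post it back with credentials, fetch the target.
// $loginUrl, $targetUrl and the two field names are placeholders.
function loginAndFetch(string $loginUrl, string $targetUrl, string $user, string $pass): string
{
    $ch = newSession(tempnam(sys_get_temp_dir(), 'cookies'));
    $fields = extractHiddenFields(fetch($ch, $loginUrl));
    $fields['txtUser'] = $user;      // placeholder field name
    $fields['txtPassword'] = $pass;  // placeholder field name
    fetch($ch, $loginUrl, $fields);
    $page = fetch($ch, $targetUrl);
    curl_close($ch);
    return $page;
}
```

    You would call loginAndFetch() with the real login URL and the URL of the page you want, then parse the returned HTML. If the login seems to fail, compare the fields your script posts with what the browser sends, since some ASPX forms also expect the submit button's name in the POST data.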
