Hi all,
I am a PHP programmer and have been given the following task. I am a newbie and am finding it a bit difficult. Could the PHP gurus here please help me out?
1. The log file is a microscopic slice of a single hour's archive from just one of the web servers. It represents only a tiny fraction of our activity for a single day.
Sample log data:
203.166.246.232 - - [12/Jul/2002:00:00:09 +1000] "GET /themes/standard/images/pm-topright.gif HTTP/1.1" 200 493 "http://public.planetmirror.com/pub/opera/win/802/en/ow32enen802.exe" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
65.214.44.33 - - [12/Jul/2002:00:00:10 +1000] "GET /pub/sourceforge/p/pe/perl-mvc/ HTTP/1.0" 200 21294 "-" "Mozilla/2.0 (compatible; Ask Jeeves/Teoma; +http://sp.ask.com/docs/about/tech_crawling.html)"
etc etc...
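The sample lines look like Apache's "combined" log format, so one way to split them into fields might be a single regular expression. This is only a sketch: the function name `parse_log_line` and the array keys are made up for illustration, and it assumes quoted fields never contain an escaped double quote.

```php
<?php
// Hypothetical helper (not part of the task description): parses one line
// of Apache "combined" log format. Returns null when the line does not match.
function parse_log_line(string $line): ?array
{
    // ip, ident, user, [time], "method path protocol", status, bytes,
    // "referer", "user agent"
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)(?: (\S+))?" '
             . '(\d{3}) (\d+|-) "([^"]*)" "([^"]*)"$/';

    if (!preg_match($pattern, trim($line), $m)) {
        return null;
    }

    // Log timestamps look like 12/Jul/2002:00:00:09 +1000
    $when = DateTime::createFromFormat('d/M/Y:H:i:s O', $m[2]);

    return [
        'ip'           => $m[1],
        // Kept in the log's own UTC offset, formatted for a DATETIME column
        'requested_at' => $when ? $when->format('Y-m-d H:i:s') : null,
        'method'       => $m[3],
        'path'         => $m[4],
        'protocol'     => $m[5],
        'status'       => (int) $m[6],
        // Apache writes "-" when no body bytes were sent
        'bytes'        => $m[7] === '-' ? 0 : (int) $m[7],
        'referer'      => $m[8],
        'user_agent'   => $m[9],
    ];
}
```

Lines that return null could be counted and skipped, so that one malformed entry does not stop the whole import.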
2. The other zip file is the IP-to-country database (info here: http://www.ip2location.com/README-IP-COUNTRY.htm)
Sample data:
33996344 33996351 GB GBR UNITED KINGDOM
50331648 69956103 US USA UNITED STATES
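The first two columns appear to be the start and end of an IP range expressed as plain integers, so a dotted address has to be converted to a number before it can be matched. A rough sketch follows, assuming 64-bit PHP (where `ip2long` returns a non-negative integer) and using the two sample rows as an in-memory table; the function names `ip_to_number` and `lookup_country` are hypothetical.

```php
<?php
// Hypothetical helper: converts a dotted IPv4 address to its integer form
// (a*16777216 + b*65536 + c*256 + d). Assumes 64-bit PHP.
function ip_to_number(string $ip): ?int
{
    $n = ip2long($ip);
    return $n === false ? null : $n;
}

// Hypothetical helper: linear scan over [from, to, code] ranges.
// In MySQL the same check would be a range condition on two columns.
function lookup_country(?int $ipNumber, array $ranges): ?string
{
    if ($ipNumber === null) {
        return null;
    }
    foreach ($ranges as [$from, $to, $code]) {
        if ($ipNumber >= $from && $ipNumber <= $to) {
            return $code;
        }
    }
    return null; // address not covered by any known range
}

// The two sample rows from the post
$ranges = [
    [33996344, 33996351, 'GB'],
    [50331648, 69956103, 'US'],
];

echo lookup_country(ip_to_number('3.0.0.1'), $ranges), "\n";
```

With the full database loaded into a table, the lookup would more likely be done in SQL, either once per log row at import time or as a join in the report.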
3. Write a PHP-driven process that slurps the log and loads it into a MySQL database (the schema is completely up to me).
4. Write a PHP report that shows:
a. The top 100 most frequently requested files, total downloads for that file, total bytes delivered, and average delivery rate for that file.
b. The most common 400-series errors by file and frequency (top 50 files)
c. The top 100 most active countries, their total downloads, and their average delivery speed in kb/sec (across all files)
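Since the schema is left open, steps 3 and 4 could be approached roughly as below. Everything here is an assumption made for illustration: the `requests` table, its columns, the `load_rows` function, and the choice to count only status 200 responses as downloads. Rows are assumed to be arrays with the keys `ip_number`, `requested_at`, `path`, `status` and `bytes`.

```php
<?php
// Hypothetical schema: table and column names are illustrative only.
const CREATE_REQUESTS_SQL = <<<'SQL'
CREATE TABLE IF NOT EXISTS requests (
    id           BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    ip_number    INT UNSIGNED NOT NULL,
    requested_at DATETIME NOT NULL,
    path         VARCHAR(1024) NOT NULL,
    status       SMALLINT UNSIGNED NOT NULL,
    bytes        BIGINT UNSIGNED NOT NULL
)
SQL;

// Report (a), partially: request count and total bytes per file.
// Treating only status 200 as a "download" is an assumption.
const TOP_FILES_SQL = <<<'SQL'
SELECT path, COUNT(*) AS downloads, SUM(bytes) AS total_bytes
FROM requests
WHERE status = 200
GROUP BY path
ORDER BY downloads DESC
LIMIT 100
SQL;

// Report (b): 400-series errors grouped by file and status code.
const CLIENT_ERRORS_SQL = <<<'SQL'
SELECT path, status, COUNT(*) AS hits
FROM requests
WHERE status BETWEEN 400 AND 499
GROUP BY path, status
ORDER BY hits DESC
LIMIT 50
SQL;

// Hypothetical loader: inserts already-parsed rows with one prepared
// statement inside a single transaction. Returns the number of rows written.
function load_rows(PDO $pdo, iterable $rows): int
{
    $stmt = $pdo->prepare(
        'INSERT INTO requests (ip_number, requested_at, path, status, bytes)
         VALUES (?, ?, ?, ?, ?)'
    );
    $count = 0;
    $pdo->beginTransaction();
    foreach ($rows as $row) {
        $stmt->execute([
            $row['ip_number'],
            $row['requested_at'],
            $row['path'],
            $row['status'],
            $row['bytes'],
        ]);
        $count++;
    }
    $pdo->commit();
    return $count;
}
```

One thing to watch: the sample log lines carry no request duration, so the delivery rate asked for in (a) and (c) cannot be computed from these fields alone. It would need a time-taken field in the server's log format or some other source of timing data.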
I would be really grateful to all for any help.
Cheers.