I'm working on a real-time hit counter. I want to count how many times each page has been displayed and record that number in a MySQL database. Another script will then read the stats from the database and show the hit count for each page to users.
Currently I'm using an SSI CGI written in C that works with a flat-file database and keeps the hit count for each page in a separate file (page1.html has a data file, page1.stats, that stores its number of hits).
The problem with this method is that it's resource-intensive, and the stats sometimes get reset to 0 when too many users access a page at the same time (for some reason, the file locking doesn't seem to work properly).
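For context, the per-page update I'm after looks roughly like the sketch below. It is written in Python rather than the original C purely for brevity, and the function name increment_counter is made up for illustration. I have not confirmed what causes the resets in my case, but one common way to get this symptom is opening the file in a truncating mode before the lock is held, so the sketch opens without truncating and only rewrites the file while holding an exclusive lock.

```python
import fcntl
import os


def increment_counter(path):
    """Add 1 to the integer stored in `path` and return the new value.

    Hypothetical helper: the file is never truncated until the exclusive
    lock is held, so a concurrent reader should not see an empty file.
    """
    # Open for read/write without truncating; create the file if missing.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            raw = f.read().strip()
            count = (int(raw) if raw else 0) + 1
            f.seek(0)
            f.truncate()          # safe here: we already hold the lock
            f.write(str(count))
            f.flush()             # push the data out before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return count
```

Even when the locking is correct, this still costs one lock and one file rewrite per page view, which is the resource usage I'd like to get away from.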
After getting some suggestions from people at PHPBuilder, I've narrowed my choices down to the following three:
1. Using a separate hit tracker, stats.php?page1.html, that updates the hits in the database and then sends the visitor to page1.html.
2. Calling stats.php?page1.html from within page1.html via a <script> or <img> tag (to avoid server-side parsing, which my current SSI setup requires).
3. Analyzing the server access log at 5-minute intervals via a cron job to simulate real-time tracking (I'm not quite sure how to do this at the moment).
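My rough understanding of how the log-analysis option could work is sketched below, in Python only to keep it short. It assumes the access log is in Common Log Format, that the cron job stores the byte offset where the previous run stopped, and that a file smaller than that offset means the log was rotated. The names read_new_lines and count_hits are hypothetical, and I have not tested this against my real logs.

```python
import re
from collections import Counter

# Matches the request and status fields of a Common Log Format line, e.g.
#   ... "GET /page1.html HTTP/1.0" 200 2326
REQUEST_RE = re.compile(r'"GET (\S+) [^"]*" (\d{3}) ')


def read_new_lines(log_path, offset):
    """Return (lines, new_offset) for data appended since `offset`."""
    with open(log_path, "rb") as f:
        f.seek(0, 2)              # seek to the end to learn the current size
        if f.tell() < offset:     # file shrank: assume it was rotated
            offset = 0
        f.seek(offset)
        data = f.read()
    # Consume only complete lines; a partial last line waits for the next run.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("latin-1").splitlines()
    return lines, offset + end


def count_hits(lines, tracked_pages):
    """Count successful GET requests for each tracked page."""
    hits = Counter()
    for line in lines:
        m = REQUEST_RE.search(line)
        if not m:
            continue
        path = m.group(1).split("?", 1)[0]   # ignore any query string
        if m.group(2) == "200" and path in tracked_pages:
            hits[path] += 1
    return hits
```

Each run would then apply the totals to MySQL with one statement per page that actually received hits, something like UPDATE stats SET hits = hits + N WHERE page = '/page1.html' (the table and column names are made up). That is at most about 500 statements every 5 minutes instead of one write per page view. Note that this simple version would miss any lines written to the old file between the last run and a rotation.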
There are about 500 pages that I'm going to track, with a total of at least 1.5 million page views per day across them (roughly 17 per second on average).
The server currently receives over 5 million hits per day, and the log files are quite large (logs are rotated 4 times a day).
My major concern right now is minimizing server resource usage. The server is equipped with dual 750 CPUs and 2 GB of RAM.
Can anyone suggest a better implementation?