Hey guys,

I'm currently developing a script that handles a massive nightly data import
into a database. Since the amount of data processed each night is large,
I've written my PHP script so that it calls itself repeatedly and handles the data in chunks. In a web browser it works like this: once one chunk is completed, it redirects to myscript.php?step=2 and starts the next step, and so on. This works great, but I don't think redirects will work when the script is run from cron. Any ideas how I could get this to work?

Thanks for any assistance

  • Shoeb

    You can write the script to take arguments from the command line.

    Let's say that argument 1 is the step number.

    ./myscript.php 2 would be the command. You would use $argc (argument count) to see if there is more than 1 argument (the first being the script name) and $argv (the argument array) to get the argument.

    in this case $argv[0] = myscript.php and $argv[1] = 2

    so instead of checking for $step, you can check $argv[1].
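    A minimal sketch of that, with the step-reading pulled into a small helper (the function name and the echo are just illustration):

    ```php
    <?php
    // Pick the step number off the command line; default to step 0
    // when no argument is given.
    function step_from_argv(int $argc, array $argv): int {
        // $argv[0] is the script name; $argv[1] is the first real argument.
        return ($argc > 1) ? (int) $argv[1] : 0;
    }

    $step = step_from_argv($argc, $argv);
    echo "Running step $step\n";
    // ... per-chunk processing for $step goes here ...
    ```

    Run as ./myscript.php 2 (or php myscript.php 2) and $step comes out as 2.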

    Another option you have, instead of doing it in chunks, is just to slow down the whole process. I assume you are using some sort of loop to put the data into the DB. You can use usleep() or sleep() to put a pause between each entry.
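    Something like this sketch, with the real insert stubbed out (the $rows data and the 50 ms pause are made up):

    ```php
    <?php
    // Throttle a tight insert loop with usleep() between entries.
    $rows = [['id' => 1], ['id' => 2], ['id' => 3]]; // stand-in data

    foreach ($rows as $row) {
        // insert_row($db, $row); // your real DB insert would go here
        usleep(50000);            // pause 50 ms between entries to ease load
    }
    echo "Processed " . count($rows) . " rows\n";
    ```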

    Whichever tickles your pickle 🙂

    hope that helps.

      Thanks for the ideas.

      My issue with slowing down the whole process is that I'm afraid of timeout issues. Although I realize I can change max_execution_time, I think it's far more reliable to handle the data in chunks as it is now; slowing down the process would be a last resort 🙂

      For the use of arguments: how would I make a cron job that successively runs the script from step = 0 to step = 104 (I have a set number of steps)? Would I have to write a shell script, or is there an easier way?

      Thanks so much for your help

      • Shoeb Omar

        Just write one long script which does all the processing. Don't invoke it via the web; run it from the CLI via cron (or something else).

        Because it's not running through the web, this big script won't have a time limit (the CLI defaults max_execution_time to 0).

        You may want this big script to do its own checkpointing so that if it is interrupted, on the next run it will continue from wherever it left off.
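        A rough sketch of that checkpointing, assuming 105 numbered steps as above (the file path and loop body are made up):

        ```php
        <?php
        // Record the last completed step in a file so an interrupted run
        // can resume where it left off on the next invocation.
        $checkpoint = sys_get_temp_dir() . '/import.checkpoint';
        $lastStep   = 104;

        $start = 0;
        if (is_file($checkpoint)) {
            // Resume one step past the last one that finished.
            $start = (int) file_get_contents($checkpoint) + 1;
        }

        for ($step = $start; $step <= $lastStep; $step++) {
            // ... process chunk $step here ...
            file_put_contents($checkpoint, (string) $step); // record progress
        }

        @unlink($checkpoint); // finished cleanly; start fresh next night
        ```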

        And if there is any chance that it could still be running when the next cron job starts, you should also add a check for that (hint: flock a file in non-blocking mode).
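        For instance, something along these lines (the lock-file path is arbitrary):

        ```php
        <?php
        // Take a non-blocking exclusive lock so overlapping cron runs
        // bail out immediately instead of stacking up.
        $fp = fopen(sys_get_temp_dir() . '/import.lock', 'c');

        if (!flock($fp, LOCK_EX | LOCK_NB)) {
            fwrite(STDERR, "Previous import still running; exiting.\n");
            exit(1);
        }

        // ... do the nightly import here ...

        flock($fp, LOCK_UN); // release the lock when done
        fclose($fp);
        echo "Import finished\n";
        ```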

        Mark
