So I've had some great luck using [man]pcntl[/man] and [man]posix[/man] functions in command-line scripts. I recently thought, "Gee, wouldn't it be great if you could [man]pcntl_fork[/man] off a process from a script hosted by Apache to perform some super-long task like converting a video file?"

So I concocted a test script and ran it via CLI a couple of times, but when I tried to access it via Apache, the exception from this guard gets thrown:

if (!extension_loaded("posix")){
	throw new Exception("This script requires posix extension");
}

And looking in /etc/php5/apache2/php.ini, I see this:

; This directive allows you to disable certain functions for security reasons.
; It receives a comma-delimited list of function names. This directive is
; *NOT* affected by whether Safe Mode is turned On or Off.
; http://php.net/disable-functions

disable_functions = pcntl_alarm,pcntl_fork,pcntl_waitpid,pcntl_wait,pcntl_wifexited,pcntl_wifstopped,pcntl_wifsignaled,pcntl_wexitstatus,pcntl_wtermsig,pcntl_wstopsig,pcntl_signal,pcntl_signal_dispatch,pcntl_get_last_error,pcntl_strerror,pcntl_sigprocmask,pcntl_sigwaitinfo,pcntl_sigtimedwait,pcntl_exec,pcntl_getpriority,pcntl_setpriority,

It seems odd to me that one would define disable_functions to exclude the pcntl functions when the module does not appear to even be installed/loaded. Also, I can't find anything that would explicitly load the pcntl functions in the CLI php.ini file. Pretty weird.

Anyways, the big question: is it kosher to utilize pcntl_fork and related functions in a PHP script hosted by Apache? Obviously the Ubuntu package maintainers didn't want it running. I found some mostly confused/confusing discussion about how to get it running, but also some suggestions that this may not be a good idea. I saw another suggestion to use ignore_user_abort, but that doesn't sound good to me -- I wouldn't want the pool of PHP processes available to Apache to get used up processing video or calculating pi to the millionth decimal, starving Apache of workers.

I have managed in the past to fork off a CLI PHP process from apache like so:

$cmd = "/usr/bin/php /path/to/some/script.php [parameter] > /dev/null & echo \$!";
$cmd_output = NULL; // will contain an array of output lines from the exec'ed command
$cmd_result = NULL; // will contain the exit code returned by the OS; zero for success, non-zero otherwise
$cmd_return = exec($cmd, $cmd_output, $cmd_result); // $cmd_return will contain the last line of output, which should be the PID of the process we have spawned

and then utilized pcntl_fork and/or [man]posix_setsid[/man] to liberate and divorce the cli process from its cruel apache master.

But that seems like a lot of work to me. I believe I had to do it this way because a) Apache doesn't have pcntl enabled, and b) just calling exec is not enough: the exec'ed process can get killed when the browser disconnects, a timeout happens, or whatever.
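
For the record, the "liberation" step inside the exec'ed CLI script looked roughly like this minimal sketch (from memory; the actual work is elided):

// worker.php (sketch) -- runs under the CLI SAPI, where pcntl/posix are available
$pid = pcntl_fork();
if ($pid < 0) {
    exit(1); // fork failed
} else if ($pid > 0) {
    exit(0); // parent exits, returning control to whatever exec'ed us
}
// child: become a session leader, detaching from the launcher's session
if (posix_setsid() < 0) {
    exit(1);
}
// ... the long-running work (video conversion, etc.) goes here ...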

    Hmm, would system() et al work for that? Perhaps most important, is it outward facing, public, etc.?

    I dunno if I have objections to allowing Apache to do it for secure, intranet type stuff anyway. I'm looking at my "big project" in this field and that's what it does. I do have some reasonable, pre-tested limits on how many processes I fork ....

    Incidentally, the PHP script is called via Ajax and the page just sits there and waits for it to tell the JS it's finished....

      dalecosp;11051281 wrote:

      Hmm, would system() et al work for that?

      If you read the code in my post, you see I'm using [man]exec[/man] to run a separate process. I have the vaguest recollection that this process would be terminated once the script that initiated it had completed -- or sometime shortly after. So the script that gets exec'ed has to call posix_setsid at least and in my working example, it also uses pcntl_fork. The exec'ed script can do this because it runs via CLI. The job of these forked processes will be to take files uploaded to my server and transfer them to a cloud storage system / cdn.

      dalecosp;11051281 wrote:

      Perhaps most important, is it outward facing, public, etc.?

      Sort of. Users will upload files (images, maybe PDFs, perhaps other formats?). After some validation of the uploaded file to make sure it is in fact an image and/or a safe PDF, the uploaded file will be delivered into cloud storage. It's been our experience that the upload itself already takes long enough, and the transfer-to-cloud-storage step on top of that is a bit too long. The idea is to make the forked process robust enough that it will try multiple times to get the file into the cloud before giving up. Cloud storage can occasionally be flaky.
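
      For what it's worth, the image half of that validation might look like this sketch (the "upload" field name is made up; a PDF would need a separate check, e.g. via finfo):

      // reject anything that isn't a parseable image of an allowed type
      $tmp = isset($_FILES['upload']['tmp_name']) ? $_FILES['upload']['tmp_name'] : null;
      if ($tmp === null || !is_uploaded_file($tmp)) {
          die('no valid upload');
      }
      $info = getimagesize($tmp); // returns false if the file isn't an image
      $allowed = array(IMAGETYPE_JPEG, IMAGETYPE_PNG, IMAGETYPE_GIF);
      if ($info === false || !in_array($info[2], $allowed, true)) {
          die('not an acceptable image');
      }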

      dalecosp;11051281 wrote:

      I dunno if I have objections to allowing Apache to do it for secure, intranet type stuff anyway.

      This will NOT be deployed on an intranet. The forked script will be forked from a PHP script accessed directly by public visitors. The parameters under which the forked process runs should be pretty constrained, though. Comments on potential security problems welcome.

      dalecosp;11051281 wrote:

      I'm looking at my "big project" in this field and that's what it does.

      Not sure what you mean here?

      dalecosp;11051281 wrote:

      I do have some reasonable, pre-tested limits on how many processes I fork ....

      I do suppose if I were to fork too many of these processes, my server might be vulnerable to some kind of DDOS. I am aware that every machine has its limits. I recall in the past tweaking a multi-processing program extensively to find its optimum process count.

      dalecosp;11051281 wrote:

      Incidentally, the PHP script is called via Ajax and the page just sits there and waits for it to tell the JS it's finished....

      I'm not sure we perfectly understand one another. In my application, you visit a website, you upload a file, the file is thoroughly inspected, and then a long-running external process is forked that will never return any data to the user. The original PHP script the user visited will terminate, and the forked process must continue to run, even if the user closes their browser and smashes their computer over Justin Bieber's head.

        Why not a process that LOOKs for things to do? In this fashion, you could create /jobs/uploads/{pid}_{timestamp}.json (or something) from the apache server, and a daemon (or cron or whatever) would come along, see a new upload to deal with, and process it accordingly. I imagine something like the following (super pseudo, but you're smart enough to deal):

        // apache.upload.process.php
        if (validate_file_upload()) {
            $dest = '/jobs/uploads/files/' . time() . basename($_FILES["pictures"]["name"]);
            move_uploaded_file($_FILES["pictures"]["tmp_name"], $dest);
            file_put_contents(
                '/jobs/uploads/' . getmypid() . '_' . time() . '.json',
                json_encode([
                    'type' => 'upload',
                    'file' => $dest,
                    'user_id' => $user_id, // etc, and other required information
                ])
            );
        }
        
        
        // job processor
        $files = glob('/jobs/uploads/*.json');
        foreach ($files as $file) {
            $job = json_decode(file_get_contents($file), true);
            // move $job['file'] to CDNs
            unlink($file);
        }
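
        (One wrinkle worth noting: if two processor instances ever overlap, they could grab the same JSON file. A cheap guard, sketched here, is to claim each file with an atomic rename inside the loop before touching it.)

        // inside the processor loop: claim the job file before working on it.
        // rename() is atomic on a single filesystem, so only one processor wins.
        $claimed = $file . '.working';
        if (!@rename($file, $claimed)) {
            continue; // another processor beat us to it
        }
        $job = json_decode(file_get_contents($claimed), true);
        // move $job['file'] to CDNs
        unlink($claimed);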
        
        

          Thanks for your suggestion!

          Derokorian;11051285 wrote:

          Why not a process that LOOKs for things to do? In this fashion, you could create /jobs/uploads/{pid}_{timestamp}.json (or something) from the apache server, and a daemon (or cron or whatever) would come along, see a new upload to deal with, and process it accordingly. [snip]

          The reason this client came to us in the first place was that they had an import process that ran every night to download remote images and copy them to the server's web root (creating various thumbnail sizes, etc.). The number of images they were dealing with became so great that the script (which dealt with one image at a time) would still be running 24 hours later when its daily scheduled cron time came around again.

          Your proposal of having a single process to be waiting for the images to be uploaded could, in theory, suffer from a similar problem. Relying on a JSON-encoded file further complicates matters (at least I think it does) in that it might prove to be a bottleneck on a busy server. If the daemon script or cron job is working on the JSON file, then the web server cannot simultaneously work on that file so we might have either a bottleneck or some kind of weird race conditions. I dunno. Maybe not. In any case, we have found that using a database for the job queue is waaaaay better -- transactions and record locking mean it is well-suited to many processes running at once.

          So imagine you have 50 customers at once come rushing over to the site, all uploading large sensitive files (e.g., 10MB apiece) -- maybe a passport or driver's license or something. One process might take quite some time to sequentially load all those files into a CDN. On the other hand, 50 distinct processes all uploading files at once might totally hose the server's bandwidth.

          Using a cron job is also a possibility, but I think this makes it even more likely that we'd see situations where we need to do a bunch of files at once.

          I think the ideal situation is one where we have a pool of processes waiting/running all the time, using a database to dole out the work. But that's a pain in the ass to code. Long-running processes are really hard to do: memory leaks, unexpected fatal errors, etc., tend to kill your process pool or leave zombie processes. You also have to be careful about locking your db records and how you dole out the workload without jobs getting dropped, skipped, or permanently locked and never unlocked. You generally need some kind of cron job to kill and restart these things, or supervise them to make sure they are healthy.
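
          To make the record-locking bit concrete, here's a sketch of how a worker might atomically claim one job -- assuming MySQL/InnoDB and a hypothetical cdn_jobs table with a status column:

          // worker side (sketch): claim one pending job at a time
          $db = new PDO('mysql:host=localhost;dbname=example', 'user', 'pass');
          $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

          $db->beginTransaction();
          // FOR UPDATE locks the selected row so two workers can't claim the same job
          $stmt = $db->query("SELECT id, file FROM cdn_jobs WHERE status = 'pending' ORDER BY id LIMIT 1 FOR UPDATE");
          $job = $stmt->fetch(PDO::FETCH_ASSOC);
          if ($job) {
              $upd = $db->prepare("UPDATE cdn_jobs SET status = 'working' WHERE id = ?");
              $upd->execute(array($job['id']));
          }
          $db->commit();

          if ($job) {
              // push $job['file'] to the CDN, then mark the row 'done' (or 'failed' for a cron sweeper to retry)
          }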

          Anyways, my initial thoughts on pseudo were something like this:

          // file upload handler
          if (validate_file_upload()) {
            move_uploaded_file($_FILES["pictures"]["tmp_name"], '/jobs/uploads/files/'. time() . basename($_FILES["pictures"]["name"]));
            // create db record in our jobs table
            $db_id = insert_cdn_job_record("blah blah blah");
            $cmd = "/usr/bin/php /path/to/cdn-upload.php $db_id > /dev/null & echo \$!";
            $log->write("command for terminate:" . $cmd);
            $cmd_output = NULL; // will contain an array of output lines from the exec'ed command
            $cmd_result = NULL; // will contain the exit code returned by the OS; zero for success, non-zero otherwise
            $cmd_return = exec($cmd, $cmd_output, $cmd_result); // $cmd_return will contain the last line of output, which should be the PID of the process we have spawned
            if ($cmd_result) {
              die("uh oh there was a problem with cdn-upload script!");
            }
          }
          

          And then the cdn-upload script is designed to run via CLI so it can happily use pcntl_fork and fork off an entirely independent process:

          // cdn-upload.php
          try {
            $db_id = $argv[1];
            if (!preg_match('/^[0-9]+$/', $db_id)) {
              throw new Exception("Invalid id!");
            }
            $log->write("cdn-upload launched for db_id=" . $db_id);

            $pid = pcntl_fork(); // fork
            if ($pid < 0) {
              throw new Exception("Fork failed! Negative value returned attempting db upload $db_id");
            } else if ($pid) { // parent
              $log->write("cdn-upload script forked off child with pid=" . $pid);
              exit(); // parent process launched via CLI just dies because it would likely be bound to the apache process
            } else { // child
              // check the db for $db_id and handle CDN upload
            }
          } catch (Exception $e) {
            // notify someone that $db_id failed?
            // try and re-queue the job?
          }
          

          In addition to this immediate-response cdn-upload process, we might also have a cron job to periodically check for failed uploads.

            OK so, having RTFM, I see that:

            TFM wrote:

            Process Control support in PHP implements the Unix style of process creation, program execution, signal handling and process termination. Process Control should not be enabled within a web server environment and unexpected results may happen if any Process Control functions are used within a web server environment.

            So I think that very obviously precludes Apache. I wonder if my PCNTL approach is still safe?

              The user comments section also has this statement, which I find a bit suspect:

              sean dot kelly at mediatile dot com wrote:

              The following statement left me searching for answers for about a day before I finally clued in:

              "Process Control should not be enabled within a web server environment and unexpected results may happen if any Process Control functions are used within a web server environment."

              At least for PHP 5.3.8 which I am using, and who knows how far back, it's not a matter of "should not", it's "can not". Even though I have compiled in PCNTL with --enable-pcntl, it turns out that it only compiles in to the CLI version of PHP, not the Apache module. As a result, I spent many hours trying to track down why function_exists('pcntl_fork') was returning false even though it compiled correctly. It turns out it returns true just fine from the CLI, and only returns false for HTTP requests. The same is true of ALL of the pcntl_*() functions.
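
              Given that, a belt-and-suspenders guard before touching these functions might look like this minimal sketch:

              // refuse to fork unless we're on the CLI SAPI and pcntl actually made it in
              if (PHP_SAPI !== 'cli' || !function_exists('pcntl_fork')) {
                  throw new Exception('pcntl_fork requires the CLI SAPI with the pcntl extension');
              }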

                This test may be of interest to others. I created a test script to be accessed via Apache; all it does is exec another PHP script in the background. This seemed to work fine:
                - apache-test.php is accessed in a browser; it forks off cli-test.php and completes
                - cli-test.php runs for at least nine minutes or so in its loop

                The two processes seemed distinct and disconnected. However, I got to wondering about how apache (prefork? fastcgi?) might treat a pool of processes. My test is working fine on my workstation, but what happens on a production server with frequent page requests and a busy php process pool? I tried the test again and restarted apache while cli-test.php was running its 10-minute loop. This killed cli-test.php (in fact all instances of it that might be running).

                //apache-test.php
                echo "preparing to fork cli-test<br>";
                
                $script_path = dirname(__FILE__) . "/cli-test.php";
                
                $cmd = "/usr/bin/php $script_path > /dev/null & echo \$!";
                echo "command for terminate:" . $cmd . "<br>";
                $cmd_output = NULL; // will contain an array of output lines from the exec'ed command
                $cmd_result = NULL; // will contain the exit code returned by the OS; zero for success, non-zero otherwise
                $cmd_return = exec($cmd, $cmd_output, $cmd_result); // $cmd_return will contain the last line of output, which should be the PID of the process we have spawned
                
                echo "cmd_result<br>";
                var_dump($cmd_result);
                echo "<br><br>";
                
                echo "cmd_output<br>";
                var_dump($cmd_output);
                echo "<br><br>";
                
                echo "cmd_return<br>";
                var_dump($cmd_return);

                the output:

                hello we are running
                preparing to fork cli-test
                command:/usr/bin/php /var/www/example/html/erp_php/cloud-files/cli-test.php > /dev/null & echo $!
                cmd_result
                int(0)
                
                cmd_output
                array(1) { [0]=> string(5) "30590" }
                
                cmd_return
                string(5) "30590" 

                Here's cli-test.php:

                <?php
                /**
                 * This is a test file to determine if it's possible to execute an independent process from apache
                 * or if the process will be coupled to the original apache process and terminated when the other
                 * process terminates
                 */
                
                try {
                    // need this for file path constants, etc.
                    require_once "configure.php";
                
                    if (!extension_loaded("posix")) {
                        throw new Exception("This script requires posix extension");
                    }
                
                    // a db log object is one option
                    $log = new Log(PRIVATE_LOG_PATH . "/cloud-files/fork.log");
                
                    $mypid = posix_getpid();
                    $log->write($mypid . ": running " . __FILE__);
                
                    // create a long-running process to see if it'll keep up or whether this process is
                    // terminated and garbage collected when the apache process that launches it dies
                    for ($i = 0; $i < 600; $i++) {
                        $log->write($mypid . " running $i th iteration");
                        sleep(1);
                    }
                
                    $log->write("run complete!");
                
                } catch (Exception $e) {
                    // TODO: send notification? write log file? die with some statement or will the output simply disappear?
                    die("exception caught: " . $e);
                }

                Sample log output:

                [2015-10-07 12:20:44] - 30590: running /var/www/example/html/erp_php/cloud-files/cli-test.php
                [2015-10-07 12:20:44] - 30590 running 0 th iteration
                [2015-10-07 12:20:45] - 30590 running 1 th iteration
                [2015-10-07 12:20:46] - 30590 running 2 th iteration
                [2015-10-07 12:20:47] - 30590 running 3 th iteration
                [2015-10-07 12:20:48] - 30590 running 4 th iteration
                [2015-10-07 12:20:49] - 30590 running 5 th iteration
                
                

                And, as I mentioned, restarting apache kills any instances of cli-test.php launched by apache-test.php.

                  Soooo apparently calling [man]posix_setsid[/man] in cli-test.php is enough to make that process survive beyond the apache restart. I'm not certain, but I'm inclined to think that this process is therefore entirely independent of apache and therefore not susceptible to any process pool management or memory reclamation work that apache might do. I could be wrong about this. Comments more than welcome.

                  The modified script (which is still running despite 3 apache restarts):

                  try {
                      // need this for file path constants, etc.
                      require_once "configure.php";
                  
                      if (!extension_loaded("posix")) {
                          throw new Exception("This script requires posix extension");
                      }
                  
                      // a db log object is one option
                      $log = new Log(PRIVATE_LOG_PATH . "/cloud-files/fork.log");
                  
                      // process id of this process
                      $mypid = posix_getpid();
                      $log->write($mypid . ": running " . __FILE__);
                  
                      $log->write($mypid . ": attempting setsid");
                      // session id -- not to be confused with php session id
                      // apparently this call will make this process a 'session leader'
                      $sid = posix_setsid();
                      if ($sid < 0) {
                          throw new Exception("setsid failed, returned $sid");
                      }
                      $log->write("setsid success, sid=$sid");
                  
                      // create a long-running process to see if it'll keep up or whether this process is
                      // terminated and garbage collected when the apache process that launches it dies
                      for ($i = 0; $i < 600; $i++) {
                          $log->write($mypid . " running $i th iteration");
                          sleep(1);
                      }
                  
                      $log->write("run complete!");
                  
                  } catch (Exception $e) {
                      // TODO: send notification? write log file? die with some statement or will the output simply disappear?
                      die("exception caught: " . $e);
                  }
                  

                  and here's what the process looks like after the setsid call:

                  USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
                  www-data 31006  0.0  0.2 387144 15684 ?        Ss   10:43   0:00 /usr/bin/php /var/www/example/html/erp_php/cloud-files/cli-test.php
                  
                    a year later

                    Was revisiting this concept today and thought I'd add some additional information.

                    First, using the ampersand to background the process means you don't get the non-zero return_var if the PHP script throws an error. To demonstrate, I created this bad.php which throws an exception:

                    echo "so far so good BUT...\n";
                    throw new Exception("grrr! I am bad!");

                    Then I wrote this script, exec.php, to execute it:

                    // example 1
                    $cmd = "/usr/bin/php /tmp/foo/bad.php > /tmp/foo/out.txt";
                    $cmd_output = NULL;
                    $cmd_result = NULL;
                    $cmd_return = exec($cmd, $cmd_output, $cmd_result); // $cmd_return will contain the last line of the command's output; there's no PID here because we're not using "echo $!"
                    
                    echo "=== CMD_RETURN ===\n";
                    var_dump($cmd_return);
                    echo "=== CMD_RESULT ===\n";
                    var_dump($cmd_result);
                    echo "=== CMD_OUTPUT ===\n";
                    var_dump($cmd_output);
                    echo "=== END OUTPUT ===\n";
                    

                    Running exec.php from the command line yields this:

                    PHP Fatal error:  Uncaught exception 'Exception' with message 'grrr! I am bad!' in /tmp/foo/bad.php:3
                    Stack trace:
                    #0 {main}
                      thrown in /tmp/foo/bad.php on line 3
                    === CMD_RETURN ===
                    string(0) ""
                    === CMD_RESULT ===
                    int(255)
                    === CMD_OUTPUT ===
                    array(0) {
                    }
                    === END OUTPUT ===
                    

                    Note how $cmd_result has the integer value of 255. Command line scripts return zero when they run properly and non-zero values otherwise.

                    If you add the ampersand to background this process, then $cmd_result is zero even if bad.php encounters a fatal error. This:

                    // example 2
                    $cmd = "/usr/bin/php /tmp/foo/bad.php > /tmp/foo/out.txt &";
                    

                    Yields this:

                    === CMD_RETURN ===
                    string(0) ""
                    === CMD_RESULT ===
                    int(0)
                    === CMD_OUTPUT ===
                    array(0) {
                    }
                    === END OUTPUT ===
                    sneakyimp@sneakyimp-ubuntu-14:/var/www/html$ PHP Fatal error:  Uncaught exception 'Exception' with message 'grrr! I am bad!' in /tmp/foo/bad.php:3
                    Stack trace:
                    #0 {main}
                      thrown in /tmp/foo/bad.php on line 3
                    

                    This presumably returns the zero (success) $cmd_result because the shell successfully launched the background process; the shell doesn't wait around to learn how bad.php fares. Note also that the exception text appears AFTER exec.php's output has finished and the command prompt has been displayed again. That's because the backgrounded process inherits stderr, which still points at the terminal we launched everything from; exec() only captures stdout, so the fatal error prints straight to the terminal when bad.php dies a moment later. The exception is not part of the output returned in $cmd_output, and its text will not be available in exec.php.

                    Adding the echo $! at the end is how we get the PID of the forked process.

                    // example 3
                    $cmd = "/usr/bin/php /tmp/foo/bad.php > /tmp/foo/out.txt & echo \$!";

                    the result:

                    === CMD_RETURN ===
                    string(4) "3036"
                    === CMD_RESULT ===
                    int(0)
                    === CMD_OUTPUT ===
                    array(1) {
                      [0] =>
                      string(4) "3036"
                    }
                    === END OUTPUT ===
                    sneakyimp@sneakyimp-ubuntu-14:/var/www/html$ PHP Fatal error:  Uncaught exception 'Exception' with message 'grrr! I am bad!' in /tmp/foo/bad.php:3
                    Stack trace:
                    #0 {main}
                    

                    This is helpful if we need to log the PIDs of processes we are launching just in case we need to forcibly terminate them later or otherwise check up on them or check log files or something.
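
                    If you do log those PIDs, a later health check can use posix_kill with signal 0, which performs error checking without actually delivering a signal -- a quick sketch:

                    // signal 0 sends nothing; posix_kill returns true if $pid exists and
                    // we are permitted to signal it, false otherwise
                    if (!posix_kill($pid, 0)) {
                        // process is gone (or owned by another user): mark the job dead, re-queue, etc.
                    }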

                    And finally, I would recommend routing stderr to stdout when we run this other script:

                    // example 4
                    $cmd = "/usr/bin/php /tmp/foo/bad.php > /tmp/foo/out.txt 2>&1 & echo \$!";

                    This routes stderr into our file along with stdout and also effects a more complete separation of bad.php from exec.php. I'm not really sure what tenuous link might hang around between exec.php and bad.php, but it just seems better to try and separate them more fully. The result is that the exception does not appear when we run exec.php but it does appear in the output file, /tmp/foo/out.txt:

                    === CMD_RETURN ===
                    string(4) "3084"
                    === CMD_RESULT ===
                    int(0)
                    === CMD_OUTPUT ===
                    array(1) {
                      [0] =>
                      string(4) "3084"
                    }
                    === END OUTPUT ===
                    

                    Sadly, in that last variation, exec.php has NO IDEA if bad.php was successful or not. $cmd_return has the PID of the forked process, $cmd_result is zero, and $cmd_output also just has the PID.

                    I can't help but wonder where stdErr goes in examples 1-3 if we run exec.php via cron or apache?

                      Using the ampersand has nothing to do with PHP; it's a shell trick. Generally, you will redirect output and stderr to files in that case. As you determined -- sorry, I wrote that before finishing reading, but I'm leaving it so you know this is a piece of shell trickery and has nothing to do with PHP's exec. Here's some light reading on the subject: [url]http://ba****out.com/2013/05/18/Ampersands-on-the-command-line.html[/url] and https://linux.die.net/man/1/bash

                      https://linux.die.net/man/1/bash wrote:

                      If a command is terminated by the control operator &, the shell executes the command in the background in a subshell. The shell does not wait for the command to finish, and the return status is 0.

                      The reason it waits for the error when you don't redirect stdErr is that you haven't disconnected all the stream handles. If you look at something like [man]proc_open[/man] to manage this (which, btw, is a far more powerful way to run commands from PHP), you will see there are 3 streams associated with a shell command: stdIn, stdOut, stdErr. To fully detach the process with exec/shell_exec/backticks you would have to (again, as you figured out) redirect stdOut and stdErr away from the firing shell environment.

                        Derokorian;11061241 wrote:

                        Using the ampersand has nothing to do with PHP; it's a shell trick. Generally, you will redirect output and stderr to files in that case. As you determined -- sorry, I wrote that before finishing reading, but I'm leaving it so you know this is a piece of shell trickery and has nothing to do with PHP's exec. Here's some light reading on the subject: [url]http://ba****out.com/2013/05/18/Ampersands-on-the-command-line.html[/url] and https://linux.die.net/man/1/bash

                        The reason it waits for the error when you don't redirect stdErr is that you haven't disconnected all the stream handles. If you look at something like [man]proc_open[/man] to manage this (which, btw, is a far more powerful way to run commands from PHP), you will see there are 3 streams associated with a shell command: stdIn, stdOut, stdErr. To fully detach the process with exec/shell_exec/backticks you would have to (again, as you figured out) redirect stdOut and stdErr away from the firing shell environment.

                        Thanks for this detail, especially the 'return status is 0' bit and also the '3 streams' -- that explains a lot. I posted my results above as an attempt to clarify/share/remember where the stdErr and stdOut stuff go. I think it's quite noteworthy that an exception thrown in the script we have forked off is not available in the script that actually does the forking. The parent script has essentially no knowledge of the success or failure of any backgrounded process. If you want the parent process to know the success or failure of its child, you have to work out some other means of communicating between them.
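
                        One cheap channel, sketched below, is a status file keyed by job id that the child writes and the parent briefly polls; the path, the 'FAILED' convention, and the 100ms budget are all made up for illustration. A db status column works the same way.

                        // parent side (sketch): launch the worker, give it a beat, check for an early crash
                        $statusFile = "/tmp/job-status-" . (int)$db_id . ".txt";
                        exec("/usr/bin/php /path/to/worker.php " . (int)$db_id . " > /dev/null 2>&1 & echo \$!");
                        usleep(100000); // 100ms: enough for the worker to die on an obvious fatal error
                        if (is_file($statusFile) && trim(file_get_contents($statusFile)) === 'FAILED') {
                            // surface the failure to the user, or re-queue the job
                        }

                        // worker side (sketch): record status at start, then overwrite on exit, e.g.
                        // file_put_contents($statusFile, 'STARTED'); ... write 'DONE' or 'FAILED' via register_shutdown_function()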

                        I still think there's a bit of mystery about how separate these processes actually are. Some production code I wrote four years ago not only uses exec to fork off a separate process, the forked process also uses [man]posix_setsid[/man] to further disconnect itself from the parent. It also unsets-and-redefines some $log and $db vars and I can't remember why. I vaguely remember that I might have had some crosstalk between processes if they used the same db connection. I can't seem to locate any notes, however.

                          sneakyimp;11061245 wrote:

                          The parent script has essentially no knowledge of the success or failure of any backgrounded process. If you want the parent process to know the success or failure of its child, you have to work out some other means of communicating between them.

                          Or don't background it? I'm confused do you want it to run in the background, separate from the calling script - or do you want the calling script to know what's going on in the child? The point of forking it off into the background, is now you can terminate the parent while the child still runs.

                          sneakyimp;11061245 wrote:

                          I still think there's a bit of mystery about how separate these processes actually are. Some production code I wrote four years ago not only uses exec to fork off a separate process, the forked process also uses [man]posix_setsid[/man] to further disconnect itself from the parent. It also unsets-and-redefines some $log and $db vars and I can't remember why. I vaguely remember that I might have had some crosstalk between processes if they used the same db connection. I can't seem to locate any notes, however.

                          Again, you're confusing me -- if you're running a shell command, the script executing the command shares no variables with the script being run, so I'm not sure why anything needs to be unset? Rather, it's a fresh script execution, with its own bootstrapping required.

                            Derokorian;11061249 wrote:

                            Or don't background it? I'm confused do you want it to run in the background, separate from the calling script - or do you want the calling script to know what's going on in the child?

                            Minimally, I'd like the parent process to know whether the child process succeeds initially (i.e., no fatal errors in the first 10 milliseconds). If you fork off a process and have no idea what it's doing, it becomes very hard to debug -- and you run the risk of having a bunch of poorly behaved scripts doing weird stuff. Even an obviously bad command will have a result of zero if you background it:

                            $cmd = "/this/script/does/not/exist &";

                            Derokorian;11061249 wrote:

                            The point of forking it off into the background, is now you can terminate the parent while the child still runs.

                            Not always. In my case, I want to delegate a time-consuming task that's mostly waiting (possibly ten minutes) to a separate process so my parent process can continue a loop which may need to launch still more servers. If I had to launch them all sequentially, it might take two hours to spin up 10 servers.

                            Derokorian;11061249 wrote:

                            Again, you're confusing me - if you're running a shell command, you share no variables with the script executing the command vs the script running the command, so not sure why anything needs to be unset? Rather it's a new fresh script execution, with its own bootstrapping required.

                            I was also confused/surprised to find that my production code from four years ago had the child process calling pcntl_fork. It seems unnecessary, but I vaguely recall that I had problems keeping the child process running if I didn't. To summarize:
                            - master.php is my script that wants to fork off processes
                            - master.php calls exec on worker.php and backgrounds the process: /usr/bin/php /path/to/worker.php "some-param" > /dev/null 2>&1 & echo $!
                            - worker.php calls pcntl_fork; the parent process exits immediately and the child process calls posix_setsid to become a 'session leader' and then does the work
                            - the first thing the child process does is unset any db-related variables and then create a new, distinct database connection (see the sketch below). This is apparently necessary because if the parent process closes the database connection, the child process can no longer use the db. My parent process doesn't explicitly close the db connection, but I seem to recall having erratic db connection problems -- I think this was due to the parent process being garbage collected and its db connections getting closed.
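
                            That db behavior is a known hazard, for what it's worth: after a fork, parent and child share the same underlying MySQL socket, so whichever process exits first closes it out from under the other. The safe pattern is for the child to open its own connection after the fork -- a sketch, with hypothetical credentials:

                            $pid = pcntl_fork();
                            if ($pid === 0) { // child
                                $db = null; // drop any inherited handle rather than reusing it
                                $db = new PDO('mysql:host=localhost;dbname=example', 'user', 'pass');
                                // ... do the long-running work with the fresh connection ...
                            }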

                            I agree that it seems like a lot of effort. I don't recall why I went through all the trouble in the first place, but I suspect it was to fix problems I was having with the production system. The code, although seemingly more complicated than it needs to be, has been running steadily and reliably every five minutes for four years. When it has trouble, it's usually due to a third-party API (Rackspace) acting up.

                              If you have a master script and you want to control child nodes, I would use proc_open. Far more flexible and much easier to facilitate communication between the two.
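
                              Something like this minimal sketch (worker path illustrative); note the reads block until the child finishes, so this suits a supervised master/worker setup rather than fire-and-forget:

                              $descriptors = array(
                                  0 => array('pipe', 'r'), // child's stdin
                                  1 => array('pipe', 'w'), // child's stdout
                                  2 => array('pipe', 'w'), // child's stderr
                              );
                              $proc = proc_open('/usr/bin/php /path/to/worker.php', $descriptors, $pipes);
                              if (is_resource($proc)) {
                                  fclose($pipes[0]); // nothing to feed the child on stdin
                                  $out = stream_get_contents($pipes[1]); // blocks until the child exits
                                  $err = stream_get_contents($pipes[2]);
                                  fclose($pipes[1]);
                                  fclose($pipes[2]);
                                  $exit = proc_close($proc); // the worker's exit code
                              }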

                                12 days later

                                Thanks for the suggestion to use [man]proc_open[/man]. I reckon I should get more familiar with file streams.
