Thanks for your response (and the error codes link).

I am not sure whether the 'too many connections' error (which is by far the most frequent error description) happens when a process initially tries to connect, or whether the process has already performed some action using the database and is coming back for another db transaction after some time has elapsed. I suspect it is an initial connection failure in most cases, but I also want to prevent dropped connections. It's pretty difficult to diagnose exactly which is happening because of the forking of processes and the fact that many, many processes all write to the same file.

Basically what happens is that I have N machines, each of which has one master process which connects to the database to lock-and-fetch 500 records from a database table. The master process then forks off one new process to handle each locked record, and that forked-off process must make its own connection to the database to avoid any crosstalk between it and its parent process -- you can't have them sharing the same db connection resource.

As each forked-off child process works, it is downloading an image, rendering thumbnails, and then uploading them to a CDN -- a potentially time-consuming process involving lots of data crunching and network latencies at various stages. It's a distinct possibility that the database connection might drop or time out somewhere between its initial connection and its final database call.

    How many children are we talking about, here?

    The MySQL manual says that 150 connections is the default, but that a typical server should be able to handle 500-1000 (or many more, depending on system resources).

    (Aside - perhaps you could "simulate" a busy server by reducing max_connections to four or five, if you have a separate testing server, that is...?)

      traq;11030525 wrote:

      How many children are we talking about, here?

      Hundreds -- easily 300.

      traq;11030525 wrote:

      The MySQL manual says that 150 connections is the default, but that a typical server should be able to handle 500-1000 (or many more, depending on system resources).

      I'm on top of this and have increased it to 300 thanks to copious amounts of RAM being available, but I need my application to be better-behaved because these errors cause a real mess in my database which is hard to clean up. The recent increase to 300 has resulted in a very well-behaved system for the time being, but I expect that in time we'll either need to add more of these processes or we'll experience greater user traffic -- probably both. I can't have my processes dying when the connection attempt fails a single time, because it leaves records locked in the database which never get unlocked.

      traq;11030525 wrote:

      (Aside - perhaps you could "simulate" a busy server by reducing max_connections to four or five, if you have a separate testing server, that is...?)

      There are a variety of other reasons which make simulating the issue difficult. I realize that a proper simulation would be a good idea, but it would be costly and difficult and the client would likely not go for it.

        sneakyimp wrote:

        I can't have my processes dying when the connection attempt fails a single time, because it leaves records locked in the database which never get unlocked.

        How do you unlock them when this happens? Maybe you could write a bash script and call it when you catch the exception.

        ....

        Do you have any idea how much time might transpire between the initial and final DB calls? If it's not too long, you could change the timeout. Alternatively, I found this comment on SO (the question wasn't really related, otherwise):

        IMHO, it is a better idea to have a smaller timeout and make multiple connections. Doing so will free up the connections for other processes.

        beyond this, I'm not sure what else to do ...never done anything like this.

          traq;11030565 wrote:

          How do you unlock them when this happens? Maybe you could write a bash script and call it when you catch the exception.

          Records are locked by setting a couple of columns:
          record_lock_microtime -- when the record was locked
          record_lock_name -- IP address / name of the server which locked the record

          Under most circumstances, the process will finish processing the record and delete it, OR it will unlock the record by contacting the database and setting those two columns to NULL. This unlocking is obviously not possible if you can't connect to the DB, and delegating to a bash script doesn't help if the bash script can't connect either. This is why trying to connect a few times seems wise -- I'm willing to bet a little persistence will eliminate the vast majority of cases where a db connection failure causes a problem. Trying once just isn't persistent enough.
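          As a belt-and-suspenders measure alongside the retries, a periodic sweep could clear any lock older than some cutoff, on the theory that a healthy worker would have finished (or unlocked) long before then. Here's a sketch -- the table name records, the function name, and the 10-minute cutoff are my assumptions; the two lock columns are the ones described above:

```php
<?php
// Hypothetical sweep for orphaned locks. Assumes a table named `records`
// with the two lock columns described above. $maxLockSeconds is a guess
// at "longer than any healthy worker should ever hold a lock".
function unlockStaleRecords(PDO $db, $maxLockSeconds = 600)
{
    $cutoff = microtime(true) - $maxLockSeconds;
    $stmt = $db->prepare(
        'UPDATE records
            SET record_lock_microtime = NULL, record_lock_name = NULL
          WHERE record_lock_microtime IS NOT NULL
            AND record_lock_microtime < :cutoff'
    );
    $stmt->execute(array(':cutoff' => $cutoff));
    return $stmt->rowCount(); // number of stale locks cleared
}
```

          You could run something like this from cron, or from each master process just before it does its lock-and-fetch.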

          My problem right now is to determine whether I need to destroy my PDO connection, close it, or just define a new PDO object. Given that [man]PDO [/man] has no ping function, I'm leaning toward just setting $this->db to NULL and then defining a new PDO object.
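          For what it's worth, here's a minimal sketch of that try-a-few-times idea. Everything here is hypothetical -- the helper name, the attempt count, and the flat one-second delay are placeholders, and the factory callable stands in for your new PDO(...) call so the retry logic stays separate from the connection details:

```php
<?php
// Minimal retry sketch. $factory is any callable that either returns a
// connection object or throws, e.g.:
//   function () { return new PDO(EREP_DB_DSN, EREP_DB_USER, EREP_DB_PASS); }
// The attempt count and delay are illustrative, not tuned values.
function connectWithRetry(callable $factory, $maxAttempts = 5, $delaySeconds = 1)
{
    $lastException = null;
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            return $factory();
        } catch (Exception $e) {   // PDOException extends Exception
            $lastException = $e;
            if ($attempt < $maxAttempts) {
                sleep($delaySeconds);
            }
        }
    }
    // All attempts failed; rethrow so the caller can unlock/log/abort.
    throw $lastException;
}
```

          Inside the class this would amount to: set $this->db to NULL, then $this->db = connectWithRetry(...).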

          traq;11030565 wrote:

          Do you have any idea how much time might transpire between the initial and final DB calls? If it's not too long, you could change the timeout. Alternatively, I found this comment on SO (the question wasn't really related, otherwise):

          It's likely to take anywhere from 5 to 30 seconds, depending on responsiveness of remote servers and time required to crunch the image data. I think increasing the timeout is a bad idea given that most processes are not actually using the connection but rather waiting on network latencies. In fact, it would likely make the 'too many connections' problem worse if every process is hoarding its db connection.

          traq;11030565 wrote:

          beyond this, I'm not sure what else to do ...never done anything like this.

          I appreciate your input, and mostly worry that my 'improved' function above is too primitive -- I'm not sniffing the type of exception that might be thrown while connecting, I'm not responding differently to different types of exceptions, I'm not closing any PDO connections, and I'm not checking first whether there's an existing PDO resource which exists (and is not null) but nevertheless cannot connect. I'm wondering what happens to a PDO connection that has timed out. I'm guessing it's still a resource object, but I don't know of any quality about it that I might test which would let me know that it has timed out and I need to reconnect. I suspect we'll only get an exception thrown on a timed-out PDO object when we try to run a query.
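          On the exception-sniffing part, one hedged possibility: PDO's MySQL connection errors tend to carry the driver error code in the message (e.g. "SQLSTATE[HY000] [1040] Too many connections"), so you could classify which ones are worth retrying versus failing fast. The function name and the exact list of codes below are my guesses, not a vetted list -- check them against the MySQL error reference:

```php
<?php
// Hypothetical classifier for transient MySQL connection errors.
// Candidate driver codes (from the MySQL error reference):
//   1040 = ER_CON_COUNT_ERROR ("Too many connections")
//   2002 = CR_CONNECTION_ERROR, 2006 = CR_SERVER_GONE_ERROR,
//   2013 = CR_SERVER_LOST
function isRetryableConnectionError(Exception $e)
{
    $retryable = array(1040, 2002, 2006, 2013);
    // PDO typically embeds the driver code in brackets in the message,
    // e.g. "SQLSTATE[HY000] [1040] Too many connections"
    if (preg_match('/\[(\d{4})\]/', $e->getMessage(), $m)) {
        return in_array((int)$m[1], $retryable, true);
    }
    return false;
}
```

          With something like this, a retry loop could bail out immediately on non-transient errors (bad credentials, SQL syntax) instead of hammering the server.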

            sneakyimp;11030567 wrote:

            records are locked by setting a couple of columns
            record_lock_microtime -- when the record was locked
            record_lock_name -- IP address / name of server which locked the record

            Under most circumstances, the process will finish processing the record and delete it, OR it will unlock the record by contacting the database and setting those two columns to NULL. This unlocking is obviously not possible if you can't connect to the DB, and delegating to a bash script doesn't help if the bash script can't connect either. This is why trying to connect a few times seems wise -- I'm willing to bet a little persistence will eliminate the vast majority of cases where a db connection failure causes a problem. Trying once just isn't persistent enough.

            mm. Yeah, that's a difficult situation.

            sneakyimp;11030567 wrote:

            It's likely to take anywhere from 5 to 30 seconds, depending on responsiveness of remote servers and time required to crunch the image data. I think increasing the timeout is a bad idea given that most processes are not actually using the connection but rather waiting on network latencies. In fact, it would likely make the 'too many connections' problem worse if every process is hoarding its db connection.

            Agreed.

            sneakyimp;11030567 wrote:

            I'm not sniffing the type of exception that might be thrown while connecting, I'm not responding differently to different types of exceptions, I'm not closing any PDO connections, and I'm not checking first whether there's an existing PDO resource which exists (and is not null) but nevertheless cannot connect. I'm wondering what happens to a PDO connection that has timed out. I'm guessing it's still a resource object, but I don't know of any quality about it that I might test which would let me know that it has timed out and I need to reconnect. I suspect we'll only get an exception thrown on a timed-out PDO object when we try to run a query.

            But you could be doing those things. 🙂

              traq;11030569 wrote:

              But you could be doing those things. 🙂

              sigh. I suppose.

              Consider this, though:
              How do you check to see if a PDO object is connected? As far as I can tell, you can't. There is no ping function.
              How do you close a PDO object? As far as I can see, there is no close function either. Should I unset it? How do I know when it's time to unset it?

              As you can imagine, if I must attempt a query before I can tell if a PDO connection has timed out, my code might get considerably more complicated -- I'd have to put that try-try-again code wherever I want to call a query.

                Maybe you could extend PDO with a ping() method. Try to [font=monospace]SELECT 1[/font], or something, and reconnect if you get an exception.

                I think a connection closes (on the PHP end, anyway) when the PDO object has no more references (so, assuming you've only assigned it to the one variable, just do [font=monospace]$PDO = null[/font]). I don't think you can explicitly close it on the MySQL side (so, another reason a long timeout is bad).
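                To make that concrete, here's one possible shape for such a subclass. The class and method names are mine, and reconnection is deliberately left to the caller -- a PDO object can't swap out its own underlying connection, so on a failed ping you'd drop the object (assign null) and construct a fresh one:

```php
<?php
// Sketch of the "extend PDO with a ping()" suggestion. ping() runs a
// trivial query and reports whether the connection still answers; it
// does NOT reconnect, since the object can't replace its own connection.
class PingablePDO extends PDO
{
    public function ping()
    {
        try {
            // query() returns false (or throws, depending on the error
            // mode) if the connection is dead.
            return $this->query('SELECT 1') !== false;
        } catch (Exception $e) {
            return false;
        }
    }
}
```

                Then before each query-heavy stretch: if (!$db->ping()) { $db = null; $db = new PingablePDO(...); } -- which keeps the try-try-again logic in one place instead of around every query.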

                  Are you by any chance using persistent connections? My experience suggests they're more likely to leave "zombie" processes hanging around and other similar problems, at least with MySQL.

                    AFAIK, I'm not "using" persistent connections. I just read a bit and see that the PDO constructor takes a 4th parameter (an options array) that would let me specify array(PDO::ATTR_PERSISTENT => false), but I am not doing this.

                    my server does permit persistent connections:

                    mysql.allow_persistent => On => On
                    mysql.max_persistent => Unlimited => Unlimited
                    mysqli.allow_persistent => On => On
                    mysqli.max_persistent => Unlimited => Unlimited

                    But the default behavior appears to be to use non-persistent connections:

                    $db = new PDO(EREP_DB_DSN, EREP_DB_USER, EREP_DB_PASS);
                    var_dump($db->getAttribute(PDO::ATTR_PERSISTENT)); // yields bool(false)
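                    For completeness, making the non-persistent default explicit via that options array would look like the following sketch. It uses an in-memory SQLite DSN purely so the snippet runs anywhere; with MySQL you'd pass EREP_DB_DSN, EREP_DB_USER, EREP_DB_PASS as in the snippet above:

```php
<?php
// Explicitly requesting a non-persistent connection via the 4th
// constructor argument (the options array). The SQLite DSN here is
// illustrative only; substitute your real MySQL DSN and credentials.
$db = new PDO('sqlite::memory:', null, null, array(
    PDO::ATTR_PERSISTENT => false,
));
```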
                    
                      6 days later

                      According to this page, you can close a connection by setting the variable holding the object to something else, so I would assume unset() works just as well. Note: a user comment on a separate page suggests that using a derived class with persistent connections does NOT work as of 5.4 -- see the bug report.
