dalecosp;11050799 wrote:No, just a list of URLs to fetch. Big array of URLs, chunk it by number of threads, give each process its list.
OK, yes, this seems to be one of the fundamental pieces of wisdom when dealing with MT situations: things can be a lot simpler if you give a process all the data it needs to complete its job when you fire it up in the first place. If you have a single 'taskmaster' thread forking off the work to be done, you can completely avoid a situation where a swarm of threads fights over some stack of data to grab its workload.
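For anyone who wants to see it in code, here's a minimal sketch of that chunk-and-fork pattern in PHP (the urls.txt file and the worker count are just assumptions for illustration):

[code]
<?php
// Sketch: split the whole workload up front, then fork one child per chunk.
// Each child gets its complete URL list at launch, so no shared state is needed.

const NUM_WORKERS = 4;                 // assumed worker count

$urls   = file('urls.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$chunks = array_chunk($urls, (int)ceil(count($urls) / NUM_WORKERS));

foreach ($chunks as $chunk) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die("fork failed\n");
    }
    if ($pid === 0) {                  // child: work through its own list, then exit
        foreach ($chunk as $url) {
            file_get_contents($url);   // the actual fetch; error handling omitted
        }
        exit(0);
    }
}

while (pcntl_waitpid(-1, $status) > 0);   // parent: reap all children
[/code]

Because nobody shares a queue, there's nothing to lock.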
dalecosp;11050799 wrote:Generally speaking they're all gonna finish within a few seconds of each other. As the program was taking multiple hours* to complete, a few seconds either way was No Big Deal(tm).
Your usage example is almost identical to one that I had to deal with. In my case, I had to fetch as many as several hundred thousand image URLs. It was not practical to break out the entire batch at the outset, because the likelihood of some kind of failure was too great for such a long-running job. I ended up using a MySQL db (and SQL transactions) and having my child processes fetch their own job queues from the db. I had to introduce some means of locking records, to show they were being worked on, so that separate processes didn't try to work on the same records.
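A stripped-down sketch of that claim-your-own-batch idea (table and column names here are invented, not my actual schema): a single atomic UPDATE marks a batch of rows as claimed, then the worker reads back only its own rows.

[code]
<?php
// Sketch of letting each child claim its own batch of jobs from MySQL.
// Table/column names (image_jobs, claimed_by, status) are made up here.

$pdo = new PDO('mysql:host=localhost;dbname=images', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$workerId = gethostname() . ':' . getmypid();   // unique-enough worker tag

// The single UPDATE is atomic, so two workers can never claim the same rows.
$claim = $pdo->prepare(
    "UPDATE image_jobs
        SET claimed_by = :worker, status = 'working'
      WHERE claimed_by IS NULL AND status = 'pending'
      LIMIT 50"
);
$claim->execute([':worker' => $workerId]);

// Read back exactly the rows this worker just claimed.
$jobs = $pdo->prepare(
    "SELECT id, url FROM image_jobs
      WHERE claimed_by = :worker AND status = 'working'"
);
$jobs->execute([':worker' => $workerId]);

foreach ($jobs->fetchAll(PDO::FETCH_ASSOC) as $job) {
    // fetch $job['url'], then mark the row 'done'
}
[/code]

An explicit transaction with SELECT ... FOR UPDATE works too; the point is that the claim itself has to be atomic.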
dalecosp;11050799 wrote:(*Now running in 25% of the time it formerly took. 🙂 )
YES YES YES. The magic of MT (and distributed processing) is amazing. Not only is my scheme MT, it is also multi-machine. In the bad old days, a daily cron job PHP script would still be running after 24 hours, when it was supposed to start again the next day. Between the actual work to be done and the latency involved in serial HTTP requests, it was still trying to choke down the 10,000 or so images it had found the prior day.

I modified the script to spawn multiple processes to perform the image processing in parallel, which sped things up about 10 or 20 times. I then went further: another script monitors the pending workload and spins up new virtual machines in response to the number of pending jobs. The result is that I can now download 300,000 images (and generate about 2.4M thumbs of varying sizes) in about 8 hours. When there's nothing to be done, the monitor script spins down the unused virtual machines, saving gobs of money each month. I've attached a chart of the images to be processed (pending images) and a chart of the number of virtual machines (imagedaemons). If you consider what the inverse of the imagedaemon graph might look like, you get an idea of how much money this saves each month versus having 10 servers running all the time.
[Attachments: pending-image-chart-screenshot.jpg, image-daemon-charg-screenshot.jpg]
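For anyone curious, the decision loop in the monitor script boils down to something like this (a rough sketch only; both spin functions are stand-ins for whatever cloud API you use, and the thresholds are invented):

[code]
<?php
// Sketch of the monitor's core loop: scale machine count to pending work.
// JOBS_PER_MACHINE, MAX_MACHINES, and both spin functions are invented here.

const JOBS_PER_MACHINE = 5000;
const MAX_MACHINES     = 10;

function spinUpImageDaemon(): void        { /* cloud-provider API call goes here */ }
function spinDownIdleImageDaemons(): void { /* and here */ }

$pdo = new PDO('mysql:host=localhost;dbname=images', 'user', 'pass');

$pending = (int)$pdo->query(
    "SELECT COUNT(*) FROM image_jobs WHERE status = 'pending'"
)->fetchColumn();
$running = (int)$pdo->query(
    "SELECT COUNT(*) FROM machines WHERE state = 'running'"
)->fetchColumn();

$wanted = min(MAX_MACHINES, (int)ceil($pending / JOBS_PER_MACHINE));

if ($wanted > $running) {
    for ($i = $running; $i < $wanted; $i++) {
        spinUpImageDaemon();            // more work pending: add capacity
    }
} elseif ($pending === 0 && $running > 0) {
    spinDownIdleImageDaemons();         // queue drained: stop paying for VMs
}
[/code]

Run that from cron every few minutes and the imagedaemon count tracks the pending-image curve, which is exactly what the two charts show.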
I'm obviously quite excited about the success of this project and am grateful that my client was kind enough to let me spend all the time it required to get it working. I'm hoping to apply this MT magic to another project, which involves a trickier situation: I need to spawn threads in a socket server to unserialize incoming messages from a socket and then queue them up on a list of tasks. More broadly, I feel that being able to implement MT when dealing with multiple shared resources is a monumental skill that I'd like to have. I studied this in college and wish I had remembered the basics. Sadly, I must re-learn it.
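I haven't built it yet, but the shape I have in mind is roughly this (a sketch only; since threading in PHP is awkward, this version substitutes forked processes and a SysV message queue for the thread-plus-shared-list design, and the port and queue key are assumptions):

[code]
<?php
// Sketch: accept connections, unserialize each incoming message, and push it
// onto a shared task queue. A SysV message queue plays the "list of tasks";
// real threads (ext-pthreads/parallel) would be the other route.

$queue  = msg_get_queue(ftok(__FILE__, 'q'));        // shared task queue
$server = stream_socket_server('tcp://127.0.0.1:9000', $errno, $errstr)
    or die("listen failed: $errstr\n");

while ($conn = stream_socket_accept($server, -1)) {
    if (pcntl_fork() === 0) {                        // child owns this connection
        while (($line = fgets($conn)) !== false) {
            $task = unserialize(trim($line));        // risky on untrusted input!
            msg_send($queue, 1, $task);              // enqueue for the workers
        }
        fclose($conn);
        exit(0);
    }
    fclose($conn);                                   // parent keeps accepting
    pcntl_waitpid(-1, $status, WNOHANG);             // reap finished children
}
[/code]

A pool of worker processes would then msg_receive() from the same queue, and that's where the shared-resource discipline I mentioned actually bites.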