Tuesday, May 29, 2007

Why Events are a Bad Idea (for high concurrency servers)

And Why Mixing Events and Threads is Even Worse

I've just been badly burned by mixing two paradigms - threading and events. All was working fine until the day before go live; the testers started to pour 400 stock baskets through the system rather than 40 which resulted in one of those issues which make your heart sink as a programmer. Between 5 and 15 stocks would go into pending which meant that there was a threading issues somewhere and it was go live that evening. Tracking it down was to prove difficult due to poor separation of duties between the threads resulting from the design having its origins as a single threaded, serial set of calls which took data from an event generated on one side and generated a modified message dispatched to the receiver and vice versa. In retrospect, the problem would have been solved by having single queue between the two threads, following the asynchronous put/take connector pattern. This would have ensured complete separation and higher throughput.

Mutex Spaghetti

As it happens, it was impossible in the time available, to redesign the solution and the simplest course of action was to go back, reluctantly, to a single threaded implementation. Time to test was a major factor here, however, not before some time was spent playing the mutex game. I started mutexing at a very low granularity which appeared to fix the issue (or break things completely) almost - I was down to one or two trades pending - which was not good enough.

Adding additional mutexes, it quickly became apparent that it's very easy to get in a mess either by blocking on an already locked mutex or by adding too many mutexes which means you end up in a mess anyway. After four hours of mutex soup - I made the decision to remediate the code back to single-threaded code. Performance, at the moment is not an issue.

So the moral of the story is look for a design pattern which fits your code - understand it and think the design through, we seldom have time, but if you can, use UML to build a sequence diagram - then you'll see the call chain and understand conflict between threads.

A friend of mine related the story of Sun's attempts to make the solaris kernel mt safe - this was a much harder task than they anticipated. Most of their effort was centred around protecting the key data structures rather and changing the programming paradigm so that users took the responsibility for data allocation.

Another pointed out an article on slashdot this morning "
Is Parallel Programming Just Too Hard?" which raises some concerns which, as we can see from the above, seem to be valid. If you're going to write parallel threads, you have to spend time on the design - it takes three times the effort and you really need to use patterns, example code and sequence diagrams. I hacked it and got away with it - to a point. If performance proves an issue, and you can bet it will at some stage soon, then it will be back to the drawing board - and this time, the design will come first.

Interestingly, threading versus events is an interesting debate which this paper: "Why Events Are A Bad Idea (for high concurrency servers)" argues well and threading comprehensively wins the day as a paradigm over events. I've always been of the opinion that separation of duties via a thread is intuitively faster than event dispatch - this paper goes a way to prove my intuition right.

Finally, there's a well deserved mention for Functional Programming Languages such as Erlang and Haskell
, which has much promise for multi-core programming as outlined in these excellent slides: Data Parallel for Haskell.


Eric B. said...

I think fresh young programmers will become much better at writing multithreaded apps as time goes on. Electrical and computer engineers who write FPGA code in VHDL and similar languages have (and have had for quite some time) a big advantage in thinking about parallel design and being able to hold state for several "threads" in their heads at once; every instruction in the program happens more or less all at once on every clock cycle on an FPGA. My guess is that once parallelism is emphasized and taught in school from the very first introductory programming class onward, the next crop of programmers won't have quite so much difficulty with it.

Iang said...

For all of these reasons, I have always preferred a single threaded design and only broken up any major threads into processes. The emphasis then is on very carefully sequencing the steps in what is effectively one thread, and achieving performance benefits by other tricks. The benefit is that one can map all the activity into the standard 20cm head, including the thread issues.

I find that once multi-threading is added to the grey mix, 20cm is not enough, and other important issues slip out on onto the floor, unnoticed.