The great concurrency challenge

Just finished watching a great video discussion between George Chrysanthakopoulos and Erik Meijer from Microsoft about Concurrency, Coordination, and the CCR. George and I share a common past. We worked together for a few years in an incubation project in Advanced Strategies under Craig Mundie, Microsoft’s CTO. That project led to the creation of the CCR (Concurrency and Coordination Runtime) and was also an inspiration to MindTouch Dream. From firsthand experience, I can attest that George is just as passionate and animated in person as he is in the video. He starts off slow and works up to an epic crescendo with arms flailing, expletives flying, and fists pounding… steamrolling his interlocutor into the ground. Very entertaining stuff!
To get to the point of this post, George and Erik talk about the problem of concurrency and how to address it. Concurrency is an extremely difficult problem in part because we all have learned to build software the wrong way. For example, one of the benefits of object-oriented programming is encapsulation. That is, data is hidden behind methods and properties. However, for distributed/concurrent systems, data hiding is the enemy, because everything revolves around data: operations act upon data and make new data available. Since operations don’t have internal state, they can be used concurrently already. Consequently, the only impediment to concurrency is the lack of simultaneously available data that can be operated on in isolation.
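To make the data-isolation point concrete, here is a minimal sketch in Python (chosen only for brevity; it is not taken from Dream or the CCR). The names word_count and count_all are made up for illustration. The operation keeps no internal state and each chunk of data is handed to exactly one call, so the calls can run side by side without locks.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(chunk: str) -> int:
    """Stateless operation: the result depends only on the input chunk."""
    return len(chunk.split())


def count_all(chunks: list[str]) -> list[int]:
    # Every chunk is an isolated piece of data, so no locking is needed
    # and the calls may run in any order or at the same time.
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(word_count, chunks))


if __name__ == "__main__":
    chunks = ["operations act upon data", "and make new data", "available"]
    print(count_all(chunks))  # [4, 4, 1]
```

The only thing that limits the parallelism here is how many independent chunks are available at once, which is the impediment described above.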
Another problem is fault tolerance. In a distributed world, bad things happen, but they aren’t necessarily permanent. In order to make robust software, code must be written to expect failure and always provide a response. For long-running applications, this also means being able to get back to a steady state after a perturbation occurs. Failure recovery is another difficult problem that becomes easier to solve if data isolation is used, but again most programmers are taught to hide data behind interfaces.
The last problem is asynchronous programming. I don’t think this is a huge issue, because programming languages and compilers will be able to take care of it well enough. The most common error in asynchronous programming is to forget to provide a response, which blocks all downstream code. This problem is easy to solve and we already did so in Dream. Arne wrote an excellent blog post about asynchronicity and coroutines a few months back. Writing asynchronous code in C# is pretty ugly today, but there are tricks to make it a bit better (especially in C# 3.0, which Dream doesn’t yet target). The bigger problem is that asynchronicity must go all the way down the stack to pay off. For example, some network streams in Mono are synchronous, so it doesn’t matter how asynchronous your code is if it relies on these streams. It simply won’t help. Ditto for library code that uses locks or other exclusion mechanisms. Asynchronicity is an all-or-nothing game, which makes it impossible for projects to adopt it incrementally.
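The "all the way down the stack" problem is easy to reproduce. The sketch below uses Python's asyncio rather than C# or Dream, and every name in it (handler, async_wait, blocking_wait, run_pair, demo) is hypothetical. Two handlers are started together; the only difference between the two runs is whether the lowest-level wait cooperates with the event loop or blocks the thread, much like a synchronous network stream would.

```python
import asyncio
import time


async def async_wait() -> None:
    # Cooperative wait: hands control back to the event loop.
    await asyncio.sleep(0.05)


async def blocking_wait() -> None:
    # Stand-in for a synchronous stream or lock deep in the stack:
    # it holds the event-loop thread for the whole duration.
    time.sleep(0.05)


async def handler(name: str, log: list[str], wait) -> None:
    log.append(f"start {name}")
    await wait()
    log.append(f"end {name}")


async def run_pair(wait) -> list[str]:
    log: list[str] = []
    # Both handlers are scheduled together on one event loop.
    await asyncio.gather(handler("a", log, wait), handler("b", log, wait))
    return log


def demo(wait) -> list[str]:
    return asyncio.run(run_pair(wait))


if __name__ == "__main__":
    print(demo(async_wait))     # both handlers start before either one ends
    print(demo(blocking_wait))  # "a" runs to completion before "b" starts
```

With the blocking wait, the handlers are still written as coroutines, yet they execute strictly one after the other. The asynchronous code above the blocking call buys nothing.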
I’m very excited about what can be done to tackle concurrency and enable highly scalable systems. Message passing, as used in Dream and the CCR, already holds parts of the solution. Dream is released under Apache License 2.0 and is used as the foundation for Deki and its services. Dream uses message passing with support for streams and provides an asynchronous programming library, but is specialized for HTTP. I wish George would open-source the CCR as well. I’m pretty sure we would take advantage of it. Unfortunately, commercialization is taking precedence over adoption, which I think is a big loss to everyone. George, if you’re reading this, please set the CCR free! (preferably under an Apache 2.0 license). ;)

Good points. I think languages like Erlang are tackling some of these design issues, but there is still some way to go.
February 13th, 2009 at 4:37 pm
Erlang, which is also message passing based, has the right elements. That’s also evidenced by the kind of scalable, distributed applications that are being built with it. However, where Erlang fails is in hiding complexity. Let me explain. If you look at Erlang programs, you will see that many share common patterns. Why is it that the programmer must know these patterns? Why not have the language itself enforce them? For example, why can’t the language catch the failure to send a message before finishing execution?
February 14th, 2009 at 8:27 am
Erlang made the choice to provide more freedom at the cost of hiding complexity. There is much to be learned from Erlang, and there is also opportunity in making new mistakes and learning from them.
Steve,
While I agree with most of what you write about concurrency, I don’t agree that object-oriented design, encapsulation and abstraction follow as failed paradigms in a world of concurrency. I agree that object interfaces do not work as concurrency boundaries, since the language constructs in use assume synchronous call execution, but neither do I think that structuring every operation as a potentially concurrent call is a wise choice. Yes, crossing concurrency boundaries with objects has the same problem as with any other data, i.e. shared state is bad. But that’s not a reflection on OO.
February 16th, 2009 at 5:37 pm
The CCR is a great piece of work and really allows the developer to push the .NET platform about as hard as it can go. Like you say, it has to be used in tandem with truly asynchronous I/O, but provided this is available, you can get some quite outstanding throughput. I’d agree that the C# code is still not as natural as it should be and in some respects, F# shows the way forward here, but I’d definitely trade some syntactic inconvenience for improved syntax and semantics.
February 17th, 2009 at 2:45 am