Posts tagged ‘multicore’

More than 10 years ago I participated in the development of an University IT system (the front- and backend to maintain grades and that sort of stuff). The system was based on a DB/2 backend (a very nice database system) with the business code stored on a Prolog interpreter (Prolog interpreter which was in-house developed) and the web backend being a Java servlet engine (the old JServ, the thingy pre-Tomcat from Apache). Prolog is famed to be slow, and Java (at that point in time) was very slow. Surprise, surprise… the bottleneck was on the DB/2 server. Eventually, as the system grow (and the database hardware was beefed up) the bottleneck come forward to the business and web tiers, but the problem was sorted by just adding more machines: The contention was on a bunch of parallel independent process, they could be run on separate machines.

The example above illustrates why the concurrency problem posed by multiple core CPUs and GPUs, might not be that much important:

  1. Many problems are not CPU bound anyway, and even if they are, the bottleneck might be elsewhere. Another example: I am the proud owner of 3 cheap, slow laptops (one being a netbook). For my use case I really don’t need faster applications, I wonder how many users really need more than they already have?
  2. Even if more CPU/GPU power is needed, a loosely coupled model (without much interprocess communication and contention issues) might be enough. This is typically the case of many web apps, which can scale by just adding more computers which run independent processes.

Concurrency, even with modern abstractions, is hard. It should be avoided if possible and it can be avoided in many applications. If it cannot be avoided, maybe a loosely coupled model is enough… Guido van Rossum has a nice take on this issue.

This is important as concurrency is being touted as an important criteria to evaluate languages. Modern functional languages (think Scala and Clojure) are being touted as a better option precisely because they are better to do concurrency (both because of functional – “no changing state” – programming and the availability of libraries implementing nice concurrency paradigms like actors).

When addressing this importance of this issue, I would propose, that people would ask themselves this: “Am I developing computationally intensive software?” and “If I am developing computationally intensive software, can I live with loosely coupled models of computation, preferably processes with no shared memory?”

This is not to say that there are not some cases where tightly coupled computing is a good idea. It is just that, this complex solution might be an overkill for many problems.

I would just like to add that I am not defending my cause, in fact it is quite the opposite. There is actually some content produced here, in the past, on how to tackle concurrent programming:

  1. LOSITAN – A multicore-aware Jython-based (Python for the JVM) Web Start application to do selection detection.
  2. An introductory tutorial on concurrent computing targeting computational biologists – Part 1, 2 and 3

It is interesting to see how different people tackle the ongoing multicore (and GPU) software “revolution”. There are strong philosophical differences on how to develop for these new concurrent architectures. Lets start with the extremes.

The most interesting extreme comes from Guido van Rossum (aka Python benevolent dictator for life): He suggests that if you want to use the available processing power of multiple cores you should have separated processes, let me quote:

[...] doesn’t mean that multiple processes (with judicious use of IPC) aren’t a much better approach to writing apps for multi-CPU boxes than threads.

Just Say No to the combined evils of locking, deadlocks, lock granularity, livelocks, nondeterminism and race conditions.

Some similar arguments are made by the message passing crowd, which seems to be quite happy with a model based on explicit message passing between separated processes.

The fundamental idea here is that shared memory between parallel computing threads can lead to a lot of grief and sorrow, thus is is better if all the data memory space is the sole propriety of a single thread. Communication occurs in a explicit form (e.g., message passing among executing code) between threads that do not share anything (other than messages).

The opposite idea can be found on the typical C/C++/Fortran, lower-level crowd: One single process, many threads, a single memory space shared among threads with concurrent access controlled through a low level mechanism like semaphores. This seems also to be the underlying idea of the OpenMP system. These folks believe that programmers can tackle parallel complexity easily (well, at least it is not an impossible, daunting task according to this philosophy).

The point of contention comes from the fact that multiple execution flows introduce a completely new class of bugs coming from the need to coordinate a lot of things going on in parallel. The worst problem introduced is non-determinism: You can execute the same program twice, WITH THE SAME INPUT and get different results. Why? Because the different threads/processes will be scheduled in unpredicted ways by the operating system (or virtual machine) which can yield different results. This severely increases the difficulty to test and debug software. The shared memory crowd (the shared memory model is more efficient and flexible as, well, memory is directly shared) will say that we can deal with this. The message passing crowd suggests that having some restrictions and explicit communication will make life easier (or, less complicated).

The Java crowd is where you can find the most variety of opinions, but the core JVM and Java language itself seems to follow the C/C++ philosophy (though with some candy thrown in, like the Fork/Join framework). But on top of that you can find everything with a vocal support community: Tuple spaces, Map/Reduce, Message passing, etc. This is not to say that the Python and C/C++ communities are monolithic (they are not! Just check the C implementations of MPI and PVM), but you really can find a lot alternatives with vibrant communities on top of the JVM.

A sort of middle of the ground approach was introduced de facto with the programming language Erlang: Erlang allows for multiple threads, but the communication is shared-nothing and based on message passing. I.e. while there is one single process with multiple threads, there is no shared-memory per se and all inter-thread communication is based on message passing. This Actor model based language has influenced some recent language libraries in Scala, Groovy and Clojure, among others where the actor model is the main concurrent programming model.

Many functional languages (like Erlang, Scala and Clojure) proponents also suggest that mutability (ie, the concept of variable stemming from imperative languages like C, Java, C#, Basic, C++, 99% of used languages) is not easily amenable to parallel programming and suggest that immutable data structures make life much easier: If what is shared cannot be changed then much less bugs can be introduced.

To sum it up: Some people suggest concurrent programming is difficult and it is better to minimize communication to tackle that difficulty. Others suggest that concurrent programming is workable and tightly-coupled memory-sharing systems are OK. Some also suggest (functional crowd) that immutable data structures help.

Further reading:
Concurrent computing (Wikipedia)
Scala actors – My preferred introduction to Actors (which happens to be based on Scala)
Erlang Concurrency Message passing (Wikipedia)

My opinion: Shared memory models are for real men! I am just a regular bloke, so I stick with message passing models. The complexity of bugs introduced by concurrent programming is much much worse compared to the existing sequential paradigm. In most of the cases that I have encountered, the restrictions imposed by message passing are acceptable compared to the benefits. Even with message passing and immutable data structures, concurrent programming is still very hard and bug prone (non-determinism is still quite possible with message passing). I expect (hope) that new R&D will allow us to tame this complexity. Avoid shared memory/tightly coupled systems like the plague!