It is interesting to see how different people tackle the ongoing multicore (and GPU) software “revolution”. There are strong philosophical differences about how to develop for these new concurrent architectures. Let's start with the extremes.
The most interesting extreme comes from Guido van Rossum (aka Python's benevolent dictator for life). He suggests that if you want to use the available processing power of multiple cores, you should use separate processes. Let me quote:
[...] doesn’t mean that multiple processes (with judicious use of IPC) aren’t a much better approach to writing apps for multi-CPU boxes than threads.
Just Say No to the combined evils of locking, deadlocks, lock granularity, livelocks, nondeterminism and race conditions.
Similar arguments are made by the message passing crowd, which seems to be quite happy with a model based on explicit message passing between separate processes.
The fundamental idea here is that shared memory between parallel threads can lead to a lot of grief and sorrow, so it is better if each piece of data is the sole property of a single thread. Communication occurs in an explicit form (e.g., message passing among executing code) between threads that do not share anything (other than messages).
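To make the shared-nothing idea concrete, here is a minimal sketch in Python. The names `worker` and `sum_by_messages` are mine, purely for illustration. Note that Python threads technically live in one memory space, so the "nothing is shared" part is a discipline enforced by convention here: the worker only ever touches its local variables and the two queues.

```python
import queue
import threading


def worker(inbox, outbox):
    # The running total is local to this thread; no other thread touches it.
    total = 0
    while True:
        message = inbox.get()
        if message is None:  # sentinel message: no more work is coming
            break
        total += message
    outbox.put(total)  # the result leaves the worker only as a message


def sum_by_messages(numbers):
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for n in numbers:
        inbox.put(n)
    inbox.put(None)
    result = outbox.get()
    thread.join()
    return result


print(sum_by_messages(range(10)))  # prints 45
```

The same shape should carry over to separate operating system processes (Python's `multiprocessing` module also provides a `Queue`), which is closer to what Guido is arguing for.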
The opposite idea can be found among the typical C/C++/Fortran, lower-level crowd: one single process, many threads, and a single memory space shared among threads, with concurrent access controlled through a low-level mechanism like semaphores. This also seems to be the underlying idea of OpenMP. These folks believe that programmers can tackle parallel complexity easily (well, at least it is not an impossible, daunting task according to this philosophy).
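For contrast, a minimal sketch of the shared-memory style, again in Python for readability rather than C or Fortran. `SharedCounter` and `count_in_parallel` are hypothetical names. I use a plain lock here; a binary semaphore would do the same job.

```python
import threading


class SharedCounter:
    """One value in shared memory, guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Read-modify-write is done while holding the lock, so two
        # threads cannot interleave inside it and lose an update.
        with self._lock:
            self.value += 1


def count_in_parallel(num_threads, increments_per_thread):
    counter = SharedCounter()  # every thread sees this same object

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(count_in_parallel(4, 1000))  # prints 4000
```

The catch is that nothing forces anyone to take the lock. Forget it in one place and the program may still pass its tests most of the time.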
The point of contention comes from the fact that multiple execution flows introduce a completely new class of bugs, arising from the need to coordinate a lot of things going on in parallel. The worst problem introduced is non-determinism: You can execute the same program twice, WITH THE SAME INPUT, and get different results. Why? Because the different threads/processes will be scheduled in unpredictable ways by the operating system (or virtual machine), which can yield different results. This makes software much harder to test and debug. The shared memory crowd (the shared memory model is more efficient and flexible as, well, memory is directly shared) will say that we can deal with this. The message passing crowd suggests that having some restrictions and explicit communication will make life easier (or, less complicated).
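A small sketch of what that non-determinism looks like, assuming nothing beyond Python's standard `threading` module (`arrival_order` is a made-up name). Each thread does a bit of work and then records its own number. Which numbers get recorded is fixed, but the order in which they arrive is up to the scheduler and is not guaranteed to be the same from one run to the next. On a lightly loaded machine you may well see the same order many times in a row, which is exactly what makes this class of bug so nasty.

```python
import threading


def arrival_order(num_threads):
    order = []
    lock = threading.Lock()

    def work(name):
        sum(range(10_000))  # a little busy work before reporting in
        with lock:
            order.append(name)

    threads = [threading.Thread(target=work, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order


result = arrival_order(8)
print(result)          # same input every time, order may vary between runs
print(sorted(result))  # the contents, however, are always 0..7
```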
The Java crowd is where you can find the greatest variety of opinions, but the core JVM and the Java language itself seem to follow the C/C++ philosophy (though with some candy thrown in, like the Fork/Join framework). But on top of that you can find everything, each with a vocal support community: tuple spaces, Map/Reduce, message passing, etc. This is not to say that the Python and C/C++ communities are monolithic (they are not! Just check the C implementations of MPI and PVM), but you really can find a lot of alternatives with vibrant communities on top of the JVM.
A sort of middle-ground approach was introduced de facto with the programming language Erlang: Erlang allows for multiple threads (lightweight “processes” in Erlang's own terminology), but the communication is shared-nothing and based on message passing. That is, while there is one single process with multiple threads, there is no shared memory per se and all inter-thread communication is based on message passing. This actor-model-based language has influenced some recent libraries for Scala, Groovy and Clojure, among others, where the actor model is the main concurrent programming model.
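A toy actor, sketched in Python with a thread and a queue, might look like the following. This is not how Erlang or the Scala actor libraries are implemented; `CounterActor`, its message names and `demo` are all invented for illustration. The point is only the shape: private state, a mailbox, and a loop that handles one message at a time.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state plus a mailbox, nothing else."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # only the actor's own thread reads or writes this
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message):
        self._mailbox.put(message)

    def _run(self):
        # Messages are handled strictly one at a time, in arrival order.
        while True:
            kind, payload = self._mailbox.get()
            if kind == "add":
                self._count += payload
            elif kind == "get":
                payload.put(self._count)  # payload is a reply queue
            elif kind == "stop":
                break

    def join(self):
        self._thread.join()


def demo():
    actor = CounterActor()
    for n in (1, 2, 3):
        actor.send(("add", n))
    reply = queue.Queue()
    actor.send(("get", reply))
    total = reply.get()
    actor.send(("stop", None))
    actor.join()
    return total


print(demo())  # prints 6
```

Because the actor processes its mailbox sequentially, the state never needs a lock. Asking for the current value is itself a message, with a reply queue standing in for a return address.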
Many proponents of functional languages (like Erlang, Scala and Clojure) also suggest that mutability (i.e., the concept of a variable stemming from imperative languages like C, Java, C#, Basic, C++, 99% of used languages) is not easily amenable to parallel programming, and that immutable data structures make life much easier: If what is shared cannot be changed, then far fewer bugs can be introduced.
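Python is not a functional language, but the idea can still be sketched with a frozen dataclass. `Account`, `with_interest` and `project_balances` are hypothetical names. Several threads read the same object, and since it cannot be modified, each one builds a new value instead of changing the shared one.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Account:
    owner: str
    balance: int


def with_interest(account, percent):
    # Returns a new Account; the shared original is never modified.
    return replace(account,
                   balance=account.balance + account.balance * percent // 100)


def project_balances(account, rates):
    # All worker threads read the same immutable object, so no lock is needed.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = pool.map(lambda rate: with_interest(account, rate), rates)
        return [a.balance for a in results]


shared = Account("alice", 1000)
print(project_balances(shared, [1, 5, 10]))  # prints [1010, 1050, 1100]
print(shared.balance)                        # prints 1000

try:
    shared.balance = 0  # any attempt to mutate is rejected
except FrozenInstanceError:
    print("immutable")
```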
To sum it up: Some people suggest that concurrent programming is difficult and that it is better to minimize communication to tackle that difficulty. Others suggest that concurrent programming is workable and that tightly coupled memory-sharing systems are OK. Some (the functional crowd) also suggest that immutable data structures help.
Further reading:
Concurrent computing (Wikipedia)
Scala actors – My preferred introduction to Actors (which happens to be based on Scala)
Erlang Concurrency Message passing (Wikipedia)
My opinion: Shared memory models are for real men! I am just a regular bloke, so I stick with message passing models. The bugs introduced by concurrent programming are much, much worse than those of the existing sequential paradigm. In most of the cases that I have encountered, the restrictions imposed by message passing are acceptable given the benefits. Even with message passing and immutable data structures, concurrent programming is still very hard and bug-prone (non-determinism is still quite possible with message passing). I expect (hope) that new R&D will allow us to tame this complexity. Avoid shared memory/tightly coupled systems like the plague!