If you want to make programs go faster on parallel hardware, then you need some kind of concurrency. Right?
In this article I’d like to explain why the above statement is false, and why we should be very clear about the distinction between concurrency and parallelism. I should stress that these ideas are not mine, and are by no means new, but I think it’s important that this issue is well understood if we’re to find a way to enable everyday programmers to use multicore CPUs. I was moved to write about this after reading Tim Bray’s articles on Concur.next: while I agree with a lot of what’s said there, particularly statements like
Exposing real pre-emptive threading with shared mutable data structures to application programmers is wrong
it seems that parallelism and concurrency are still being conflated. Yes we need concurrency in our languages, but if all we want to do is make programs run faster on a multicore, concurrency should be a last resort.
First, I’ll try to establish the terminology.
A concurrent program is one with multiple threads of control. Each thread of control has effects on the world, and those threads are interleaved in some arbitrary way by the scheduler. We say that a concurrent programming language is non-deterministic, because the total effect of the program may depend on the particular interleaving at runtime. The programmer has the tricky task of controlling this non-determinism using synchronisation, to make sure that the program ends up doing what it was supposed to do regardless of the scheduling order. And that’s no mean feat, because there’s no reasonable way to test that you have covered all the cases. This is regardless of what synchronisation technology you’re using: yes, STM is better than locks, and message passing has its advantages, but all of these are just ways to communicate between threads in a non-deterministic language.
A parallel program, on the other hand, is one that merely runs on multiple processors, with the goal of hopefully running faster than it would on a single CPU.
So where did this dangerous assumption that Parallelism == Concurrency come from? It’s a natural consequence of languages with side-effects: when your language has side-effects everywhere, then any time you try to do more than one thing at a time you essentially have non-determinism caused by the interleaving of the effects from each operation. So in side-effecty languages, the only way to get parallelism is concurrency; it’s therefore not surprising that we often see the two conflated.
However, in a side-effect-free language, you are free to run different parts of the program at the same time without observing any difference in the result. This is one reason that our salvation lies in programming languages with controlled side-effects. The way forward for those side-effecty languages is to start being more explicit about the effects, so that the effect-free parts can be identified and exploited.
It pains me to see Haskell’s concurrency compared against the concurrency support in other languages, when the goal is simply to make use of multicore CPUs (Edit: Ted followed up with a clarification). It’s missing the point: yes of course Haskell has the best concurrency support :-), but for this problem domain it has something even better: deterministic parallelism. In Haskell you can use multicore CPUs without getting your hands dirty with concurrency and non-determinism, without having to get the synchronisation right, and with a guarantee that the parallel program gives the same answer every time, just more quickly.
There are two facets to Haskell’s determinstic parallelism support:
- par/pseq and Strategies. These give you a way to add parallelism to an existing program, usually without requiring much restructuring. For instance, there’s a parallel version of ‘map’. Support for this kind of parallelism is maturing with the soon to be released GHC 6.12.1, where we made some significant performance improvements over previous versions.
- Nested Data Parallelism. This is for taking advantage of parallelism in algorithms that are best expressed by composing operations on (possibly nested) arrays. The compiler takes care of flattening the array structure, fusing array operations, and dividing the work amongst the available CPUs. Data-Parallel Haskell will let us take advantage of GPUs and many-core machines for large-scale data-parallelism in the future. Right now, DPH support in GHC is experimental, but work on it continues.
That’s not to say that concurrency doesn’t have its place. So when should you use concurrency? Concurrency is most useful as a method for structuring a program that needs to communicate with multiple external clients simultaneously, or respond to multiple asynchronous inputs. It’s perfect for a GUI that needs to respond to user input while talking to a database and updating the display at the same time, for a network application that talks to multiple clients simultaneously, or a program that communicates with multiple hardware devices, for example. Concurrency lets you structure the program as if each individual communication is a sequential task, or a thread, and in these kinds of settings it’s often the ideal abstraction. STM is vitally important for making this kind of programming more tractable.
As luck would have it, we can run concurrent programs in parallel without changing their semantics. However, concurrent programs are often not compute-bound, so there’s not a great deal to be gained by actually running them in parallel, except perhaps for lower latency.
Having said all this, there is some overlap between concurrency and parallelism. Some algorithms use multiple threads for parallelism deliberately; for example, search-type problems in which multiple threads search branches of a problem space, where knowledge gained in one branch may be exploited in other concurrent searches. SAT-solvers and game-playing algorithms are good examples. An open problem is how to incorporate this kind of non-deterministic parallelism in a safe way: in Haskell these algorithms would end up in the IO monad, despite the fact that the result could be deterministic. Still, I believe these kinds of problems are in the minority, and we can get a long way with purely deterministic parallelism.
You’ll be glad to know that with GHC you can freely mix parallelism and concurrency on multicore CPUs to your heart’s content. Knock yourself out 🙂
Pingback: edythemighty's status on Wednesday, 07-Oct-09 03:18:58 UTC - Identi.ca
You seem to be unaware of the fact that the parallel programming literature has been using the term “concurrency” in a way that is synonymous with “parallel.” I think your definitions of these terms are not what many practitioners of parallel programming agree with.
I’m afraid you need to look up what parallelism means. Pretty much everything in your article is wrong.
First off, concurrency is NOT necessarily one with multiple threads. Using the floating point unit at the same time you’re using the ALU is concurrency and there’s no need for multiple threads. Heck, sending something off to the video card to be processed while setting up the next frame on the CPU (like updating positions and whatnot) is also an example of concurrency. Again, no need for multiple threads (though it does use multiple processors).
Having said that, you should know that concurrency is a superset of parallelism.
Second, parallelism does not necessarily run on multiple processors. Parallelism is simply executing the same operations at the same time (normally on different data, but not required). Vector processing is a form of parallelism. Here, there is only one processor and it’s parallel processing. Your video card supports parallel processing when executing the same pixel shaders on all pixels (amongst other things).
With the rest of your article, there’s just so much wrong. You don’t even qualify what is deterministic or not. Is it the execution profile, or the data, or that you can tell what can be executed in parallel (or concurrently) from the source code? What? Give the reader a hint.
Also, if you use some kind of command to force something to run in parallel, then it’s no longer different than using threads or whatever else as far as whatever definition of determinism you’re using. Where do you set the boundary? One command is ok, but creating your own command that spawns a thread isn’t? How does that work? Just because the par command is built-in and the custom thread isn’t? That doesn’t pass the smell test.
I could go on and on, but this should suffice.
Addressing your second-to-last paragraph, shouldn’t it be possible to create a parallel version of the ST monad? I imagine that it would support a “forkST” operation and semaphores and/or an MVar-like variable.
Library authors could then create DSLs that would transparently support parallelism without leaning on the IO monad.
Nice article explaining the difference between Concurrency and Parallelism.
Pingback: Concurrency => Parallelism at Ted Leung on the Air
One comment: you say that a possible advantage of running concurrent processes in parallel is lower latency, but ironically, when you have more concurrency than execution units, making them all parallel *introduces* latency, from the context switching that occurs in the OS kernel. In this case, you’re better off with asynchronous operations, ideally in a continuation passing style, where when I/O operations complete, the next continuation gets added to a work stealing queue, and the async I/O callback thread(s) is free to handle the next async I/O completion.
It seems to me you’re twisting your own definitions of concurrency and parallelism to support your own views. It reminds me much of the flame wars when UML was maturing arguing about the difference between “type” and “class”
> I’m afraid you need to look up what parallelism means. Pretty much everything in your article is wrong.
The point of this article is that the clear distinction between two particular notions (here called “concurrency” and “parallelism”) is useful. That the name given to those notions aren’t the generally accepted ones is irrelevant. don’t base the rest of your rebuttal on that.
> Using the floating point unit at the same time you’re using the ALU is concurrency and there’s no need for multiple threads.
What you described is parallelism (according to the only definition that matters here, namely the one presented in the article). Well, this is on one processor only here, but having different processes going on simultaneously while remaining completely separate and transparently guaranteeing a given result is definitely in the spirit of parallelism as defined here. And so are many other kind of SMID, be they on one chip on in zillions.
You do have a small point though: the definition of “processor” may have been more clearly defined.
That was my first point. Going to the second, I suggest you read the article more carefully:
> We say that a concurrent programming language is non-deterministic, because the total effect of the program may depend on the particular interleaving at runtime.
Can you still say with a straight face that “You don’t even qualify what is deterministic or not.”? By the way, execution profile and data are irrelevant by themselves. These are the “how”, not the “what”. The only “what” anyone care about any program, is the observable effects of that program. If the observable effect of any possible program depends on the interleaving, the program is concurrent. Otherwise, it is at most parallel. This is again according to the definitions of this article.
Finally, in the hope of lifting this confusion of yours:
> Also, if you use some kind of command to force something to run in parallel, then it’s no longer different than using threads or whatever else as far as whatever definition of determinism you’re using. Where do you set the boundary?
The boundary is determinism. If you use a mechanism which guarantees you that the result won’t be dependant on the execution order of the relevant parts of your program, you are doing parallelism. An example is a parallel map function to which you give a pure function as parameter. (In the unlikely possibility that you don’t understand my last sentence, go learn Haskell.)
Pingback: Rich Wareham (richwareham) 's status on Wednesday, 07-Oct-09 12:01:07 UTC - Identi.ca
Pingback: Concurrency Can Be Deterministic (But The Type System Doesn’t Know It) « Communicating Haskell Processes
It seems to me you’re twisting your own definitions of concurrency and parallelism to support your own views. (+1)
Contrary to other commenters this is exactly how I have been tought and how did I teach about parallelism and concurency. I found your words enlightening and translated them to Polish language here: http://gracjanpolak.wordpress.com/2009/10/07/rownoleglosc-wspolbieznosc/. I hope you don’t mind 🙂
As has been said, this article is redefining terms moreso than explaining real usage.
You do touch on an important point that I agree with though. I think it is a mistake how concurrency is introduced to students — with the focus on mutexes, semaphores, etc… I believe it is better practice to carefully minimize required interactions upfront in the design phase so that processes can run on their own without the need to share common data or interfere with one another. Then this can occasionally be brought together. (this isn’t always possible – but often is to a large extent)
Pingback: Równoległość /= współbieżność « Haskell. Po polsku.
> As has been said, this article is redefining terms
> moreso than explaining real usage.
This terminology is pretty standard in the programming languages community. If you read the wikipedia articles carefully they do say the same thing:
although the latter article unfortunately implies that parallel programs have to be concurrent, which is clearly not true.
Thanks also to mbrubeck over on Y combinator who pointed out this reference:
I’m not saying this terminology is ubiquitous – not at all, I’m well aware that many people will be more familiar with the use of the term “concurrent” to mean the same as “parallel”. Nonetheless, I do think that making a distinction between these concepts is crucial. If you don’t agree with the terminology then fine; that’s not the point of the article. Feel free to apply your own alpha renaming.
In support of the argument, I’d like to add yet another formulation.
Concurrency means that non-determinism is the desired feature and the purpose is to model inherently concurrent communication patterns like Dijkstra’s dining philosophers.
Parallelism means that determinism is the desired feature and the purpose is to speed up ordinary computations like multiplying matrices.
Of course, parallelism can be implemented with concurrency, but that’s not what it is about.
Very logical way by you
Before you start wandering off into the weeds, you really need on consider SQL, which is by far the most widely used language for both parallelism and concurrency.
After reading the comments here and at ycombinator(hackernews) Im more confused than before. Maybe this is part of the reason why thinking multicore is hard for us humans.
Simon’s definitions are spot on. It’s true that many people don’t properly understand the distinction between concurrency and parallelism — unfortunately, also some people who should know better. Referring to vague and imprecise definitions in the literature doesn’t change that fact.
That’s why I like this blog post so much. It sets the record straight!
Isn’t dataflow concurrency, as in Oz, deterministic?
I agree that it’s important to separate concurrency and parallelism (as you define them). I don’t necessarily agree with the statement that STM is “vitally important” to make concurrent programming more tractable. If you refer to concurrent programming with shared data, I am more inclined to agree, but with share-nothing message-passing concurrency, as e.g. in Erlang, it is demonstrably rather straightforward to build very large and complex concurrent systems without STM.
> I don’t necessarily agree with the statement that STM
> is “vitally important” to make concurrent programming
> more tractable. If you refer to concurrent
> programming with shared data, I am more inclined to
> agree, but with share-nothing message-passing
> concurrency, as e.g. in Erlang, it is demonstrably
> rather straightforward to build very large and
> complex concurrent systems without STM.
A fair point – where I said STM I should really have included message passing and perhaps some of the other less problematic concurrency abstractions such as Actors.
The definitions of parallelism and concurrency given here and in section 7.18 of the GHC user manual are very implementation specific. If we ignore how to execute these ideas, then a parallel program is a special case of concurrent programming where you fork a pure function for each element of a data structure and block until you receive all the results. Frankly, I don’t see much conceptual difference between par/seq and fork & block on MVars.
There are special cases of concurrent programs that can be implemented as coroutines or some other CPS single-threaded pattern. There are special cases of parallel programs that can be implemented as SSE instructions or distributed MapReduce programs. So let’s emphasize the abstraction and not worry so much about the implementation.
I agree with you that it would be great if languages supported implicit parallelism. And Haskell seems to be pretty much the purest thing around, so in a lot of ways it’s well-positioned to do this.
However, if this is really the goal, then I think the goal should be that a program shouldn’t require any annotation (so no “par” or data-parallel type annotations.) People still need strictness annotations, though, so maybe I’m dreaming…
Pingback: Concurrency « Programmagic!
The title should be multiple threads /= multiple processors. While the ideas are correct, the words definitions open to many interpretation
Great Article … Although u did renew the definitions .. u did state a obvious fact, there is a clear distinction between the two terms Parallelism & Concurrency … i think in the world of modern multicore computing we need new bold definitions for computing and parallelism. The article is not wrong in any way, What many purists in the comments above failed to acknowledge was the fact, that there is a lot of difference between the way programs are structured and executed in a parallel vs concurrent fashion. Instead they were fighting over getting the definitions right first.
I appreciate the distinction between parallelism and concurrency. But I’d be a bit happier our experts did not equate ‘concurrency’ to ‘indeterministic interleave’. (We ought to consider concurrent constraint programming, temporal logic programming, functional reactive programming, etc.)
Pingback: What are some good resources for learning about distributed computing? - Quora
Pingback: Why Haskell – 20 reasons | Scrive Labs