The linear execution of a program by a processor is unnatural. It was the way in which the first computers of von Neumann and Turing operated, in part because the resources needed to create even a single processor were enormous. It has continued to be the way in which computers operate just because “it is the way in which computers work.” Well, perhaps that is a bit unfair, but if we train thousands of people to think in ways that allow them to develop systems that execute linearly, then it is very hard to get them to think in other ways.
But the world is inherently parallel. In any society, company, or other organisation, people get on with their own areas of activity and communicate with others as needed. So why is there this fuss about how difficult it is to build systems that will execute on multiple processors? And, even more surprisingly, why is this fuss an issue in the embedded world, where we have been building systems incorporating high degrees of parallelism for decades? The different elements of parallel execution were not necessarily executing on a multi-core processing element, or even on multiple copies of the same processor architecture, but all the issues that are currently the subject of prolonged anguished debate have already been resolved within products that are shipping in millions.
Some of the problems in implementing on a multi-core processor are related to the architecture chosen, particularly to how the processors and their applications communicate. Intel’s decision to communicate through cache memory, for example, has been questioned, with the biggest question mark over how this will scale to tens or even hundreds of cores. Being Intel, the company will probably find ways of solving that problem. But the real problem is people: people are conservative, and, once trained in a particular mode of thinking, they are reluctant to change. A special case of this is the way in which programmers will cling tenaciously to the language in which they have been trained.
We are now seeing an explosion of concern over programming parallel systems, triggered in part by Intel’s very public embrace of multi-core as the way to achieve improved processor performance at acceptable levels of power consumption. Alongside this has been a rash of new companies launching new multiple-processor architectures. So people are looking for better ways to develop new systems and to convert existing or legacy software to these new systems.
Take the legacy problem first. Next week, my colleague, Bryon Moyer, will be looking at how a Dutch-based company, Vector Fabrics, has developed powerful tools for analysing existing code and parallelising it to run as multiple threads. Other people are attempting this as well, but all of them run into Amdahl’s law, which says that, however much you speed up the code that can be parallelised, the fraction of the runtime that must still execute sequentially sets the limit on the overall improvement.
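To put numbers on Amdahl’s law, here is a minimal sketch in Python. The function name and the 10 per cent sequential fraction are illustrative assumptions, not figures from any particular tool or product.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Best-case speedup when `serial_fraction` of the runtime cannot be parallelised."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    # The sequential part takes the same time however many cores there are;
    # only the parallel part is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed example: 10% of the runtime stays sequential.
for cores in (2, 8, 64, 1024):
    print(cores, round(amdahl_speedup(0.10, cores), 2))
```

With a tenth of the runtime stuck in sequential code, the speedup can never reach ten times, however many cores are thrown at the problem.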
This week, I am more interested in how to develop new software. So far, the main effort has been devoted to adding parallel constructs to existing languages, and, in the embedded space, this has normally been C. This raises some interesting issues, as the C language is already sprawling and messy. Normally, the route taken has been to create a subset of C and then add the required constructs that divide the language into sections that can execute in parallel and provide for the communication between them. But when you start looking at these languages, it is clear that many are designed to run on large machines, even supercomputers, and they are not appropriate for an embedded environment. (And Handel C was designed for generating hardware description language code and is still in use for FPGA programming.)
OK, so where do we go from here? Tucker Taft, of AdaCore, has no doubts. He thinks we need to consider a new language, and he has been working on one – which he calls ParaSail.
To give this some context – Taft has been around the block. Nearly forty years ago he was the system programmer on the first UNIX machine in the wild, at Harvard University. He has spent much of the intervening time working on Ada, including the rather mind-warping task of building an Ada compiler in Ada. Compilers need to do more than just compile, and he began building in analysis tools that help to get improved optimisation. These tools also highlighted code issues, and he looked at ways to show these to the programmers. In the end, he abandoned the compiling stuff and concentrated on the code analysis, producing, within a company he founded called SofCheck, a set of static code analysis tools for Java and Ada.
One of his licensees was AdaCore, which incorporated SofCheck Inspector into its CodePeer tool, and which, early in 2012, bought the company, making Taft the Director of Language Research. He took with him his work on ParaSail (Parallel Specification and Implementation Language).
His reason for working on ParaSail is that he saw that it would soon be necessary to develop systems to run on hundreds of cores. He argues that today’s languages are not just going to find it difficult to extend to cope with this level of parallelism, but that constructs within many languages make it certain that they won’t be at all suitable. His particular list of parallelisation breakers includes:
Garbage-Collected Global Heap
Run-Time Exception Handling
and worst of all …
If you are having a meal with Taft and want to catch up on eating, just ask him what is wrong with pointers in a parallel environment. Plenty of time to eat and listen. So ParaSail excludes pointers as well as all the other evil language constructs on his list.
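One commonly cited objection to pointers is aliasing: two names that look unrelated can refer to the same data, so neither a compiler nor a reader can easily tell whether two pieces of work are really independent. The sketch below is an analogy only. It uses Python references rather than C pointers, it is not ParaSail code, and all the names in it are made up for illustration.

```python
def scale(values, factor):
    """Multiply every element in place."""
    for i in range(len(values)):
        values[i] *= factor


def total(values):
    """Add up the elements without modifying them."""
    return sum(values)


readings = [1, 2, 3]
alias = readings            # a second name for the SAME list
snapshot = list(readings)   # an independent copy

# Looking only at these calls, scale(alias, ...) and total(readings) appear
# to touch different data. They do not: alias and readings are one object.
scale(alias, 10)

print(total(readings))   # sees the scaled values
print(total(snapshot))   # unaffected, because it shares nothing
```

Run sequentially, the outcome is predictable. If scale(alias, 10) and total(readings) were handed to two different cores, the sum would depend on timing, and nothing in the call sites warns you of that.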
(Catch up on these arguments on the ParaSail blog, which also links through to a download of the language with libraries etc.)
ParaSail is designed to look familiar to anyone used to working with something like Java or C++. It uses an object-oriented model (and replaces the pointers with expandable and shrinkable objects). But the big initial difference is that it requires the programmer to work hard to write sequential code, the default mode being parallel.
The compiler is obviously an integral part of the language, and Taft’s experience with developing compilers and code analysis tools contributes to the compiler’s error checking and debugging. It is, in effect, a high-powered and specialised code analyser as well as a compiler, and it looks for, and eliminates, race conditions and other run-time error conditions, such as using null values or uninitialised data, indexing outside an array, numeric overflow, dangling references, etc. The compiler also creates hundreds of micro-threads, which can be assigned to the multiple processors/cores for execution.
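The idea of many small units of work being handed to a few cores can be sketched by hand, although in ParaSail it is the compiler, not the programmer, that does the splitting. What follows is an illustration in Python using the standard concurrent.futures module, not a description of how ParaSail is implemented. The function names, chunk size and worker count are assumptions, and in standard CPython the threads show the structure rather than delivering a real speedup on CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """One small, independent unit of work: it shares no state with the others."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, chunk_size=100, workers=4):
    # Cut the job into many small tasks ...
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # ... and let a small pool of workers pick them up as they become free.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    # Combining the partial results is the only sequential step.
    return sum(partials), len(chunks)


total, task_count = parallel_sum_of_squares(list(range(10_000)))
print(task_count, "tasks shared among 4 workers, total =", total)
```

Because each task reads only its own chunk and returns a value, there is nothing for the tasks to race over. That is the property a language without pointers or shared global state aims to guarantee by construction.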
Obviously Taft is a great enthusiast for ParaSail; it is his creation. It’s up to you to carry out a closer examination to see if it is likely to meet your parallel processing requirements. But whatever the future for ParaSail, programmers are going to have to bite the bullet and accept that if they are going to efficiently develop programs for embedded applications that deploy massively parallel processing, they are going to require a new language. There will be a rear-guard action of further attempts to force existing languages into a parallel processing paradigm, but these are going to be inefficient and will probably require intensive debugging, for which the tools do not yet exist. The big fear is that the recognition of this need is going to spark an explosion of new languages, muddying the waters still further and starting yet more software wars of religion.