
Parallel Processing Considered Not Harmful

Parallel processing has long been held to be the way to achieve high processing throughput at a reasonable cost. Yet there are still few generally available systems, and programming them is seen as difficult. European Editor Dick Selwood spoke to the pioneer of parallel processing, Iann Barron, about the issues.

Thirty years ago Iann Barron was one of the founders of the British semiconductor company INMOS. The company was set up for several reasons, but one of them was to commercialise parallel processing, using a specialised device called the transputer. INMOS has now disappeared, the remains of its technology being held by ST, but its legacy is still around in many different companies and in a general acceptance of parallel processing.

In an ideal parallel processing environment, tasks are spread across an array of processors, and adding more processors increases the performance. For some applications, the array can carry out the same instruction at the same time (Single Instruction, Multiple Data – SIMD), rather like soldiers drilling. However, most applications in the real world require Multiple Instructions, Multiple Data (MIMD), where each processor is carrying out an individual task and communicating with other processors as needed. It is this that is the target of much work on parallelism, and it is implementing software to run on these systems that is seen as the barrier to widespread adoption of parallel processing systems.
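
To make the distinction concrete, here is a rough sketch in Go. The language, the function names simdStyle and mimdStyle, and the toy data are all our own assumptions for illustration; none of them comes from the interview. Goroutines only approximate SIMD, since real SIMD hardware executes a single instruction in lockstep across its data.

```go
package main

import (
	"fmt"
	"sync"
)

// simdStyle applies the same operation to every element, one worker per element.
// This only imitates SIMD: every worker runs the identical instruction on its own datum.
func simdStyle(data []int) []int {
	out := make([]int, len(data))
	var wg sync.WaitGroup
	for i, v := range data {
		wg.Add(1)
		go func(i, v int) {
			defer wg.Done()
			out[i] = v * 2 // same instruction, different data
		}(i, v)
	}
	wg.Wait()
	return out
}

// mimdStyle runs two different tasks that communicate as needed over a channel.
func mimdStyle(data []int) int {
	squares := make(chan int)
	go func() { // task A: produce squares
		for _, v := range data {
			squares <- v * v
		}
		close(squares)
	}()
	sum := 0
	for s := range squares { // task B: accumulate what task A sends
		sum += s
	}
	return sum
}

func main() {
	fmt.Println(simdStyle([]int{1, 2, 3, 4})) // [2 4 6 8]
	fmt.Println(mimdStyle([]int{1, 2, 3, 4})) // 30
}
```

In the first function every worker does the same thing; in the second, the two tasks do different work and must coordinate, which is where the programming effort goes.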

Iann Barron has been thinking about parallel systems for over thirty years and has published widely. Embedded Technology Journal spoke to him about the current issues.

ETJ: Why is parallel processing important?

Iann Barron (IMB): Existing techniques to improve processor performance have reached or passed their limit (a processor’s silicon is already mostly cache RAM), and yet the technology allows even larger processor chips to be built. Parallel processing is the way that performance can continue to improve.

ETJ: How far have people moved on in adopting parallelism in the last thirty years?

IMB: They haven’t: if anything they are behind where we were thirty years ago.

ETJ: But we have seen things like Intel’s dual core entering the main stream.

IMB: Yes, but Intel has hit the wall with the quad core. All they are doing is multi-threading on several processors rather than just one; there isn’t enough inherent parallelism in Windows and current user applications to keep even four processors usefully busy.

The inexorable increase in the capability of silicon means that we can build ever larger processors; however, the current techniques for increasing the performance of a processor long ago ran into the buffers (already the silicon area of a processor is mostly memory), and now no one has any good idea what to do, except to increase the number of processors on a chip.

Now that is a good idea, except that to use more processors means rewriting applications, and that makes Intel vulnerable unless they have a very clever strategy to move the market into multi-processor applications.

ETJ: Is the Intel Quad Xeon a good model for future parallel processing?

IMB: The Xeon is an impossibly over-complicated processor, which is almost certainly not an effective use of silicon. Simpler processors, which are properly designed to support parallel processing, will prove more cost effective, and if Intel does not produce them, one of its competitors will – and that is another strategic problem for Intel.

ETJ: What does it take to design a proper parallel processor?

IMB: The key thing is that there must be a close correspondence between the hardware model of the processor and the software model of a process – the basic parallel task unit. In particular, the communications model between processors needs to correspond to the communications model between processes.

This is because it will be necessary to map the same application onto different parallel configurations (how many processors will there be in your PC?), so the computational overhead caused by mapping a set of processes (the application) onto a specific set of processors (the computer) must be kept low.

The transputer got this exactly right. A single sequential processor could support many processes, being able to swap between processes in the equivalent of 10 instructions, while the communication between processes was identical, whether they were running on the same processor or different processors.
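
The idea of placement-independent communication can be sketched in Go, whose channels are in the same tradition. This is an illustration under our own assumptions, not transputer code: doubler, link and run are hypothetical names, and link merely forwards messages to stand in for a physical connection between two processors.

```go
package main

import "fmt"

// doubler is a process: it knows only its channel ends,
// not whether its peer runs on the same processor or another one.
func doubler(in <-chan int, out chan<- int) {
	for v := range in {
		out <- 2 * v
	}
	close(out)
}

// link stands in for a hardware connection between two processors
// (an assumption for illustration; here it just forwards messages).
func link(from <-chan int, to chan<- int) {
	for v := range from {
		to <- v
	}
	close(to)
}

// run feeds values to doubler either directly or across the simulated link.
// The doubler code is identical in both cases; only the wiring changes.
func run(values []int, remote bool) []int {
	src := make(chan int)
	var in <-chan int = src
	if remote {
		wire := make(chan int)
		go link(src, wire)
		in = wire
	}
	out := make(chan int)
	go doubler(in, out)
	go func() {
		for _, v := range values {
			src <- v
		}
		close(src)
	}()
	var results []int
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	fmt.Println(run([]int{1, 2, 3}, false)) // direct wiring: [2 4 6]
	fmt.Println(run([]int{1, 2, 3}, true))  // via the simulated link: [2 4 6]
}
```

The point of the sketch is that the process is written once against its channels, and the decision about where it runs is made separately, in the wiring.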

ETJ: So all we need is a new processor designed for parallel processing?

IMB: No – you have to understand that you can’t take existing programs and run them through a software black box to make them into parallel programs. The act of creating a serial program means that the information about the parallel aspects of an application has been lost, and there is no easy way to get it back. So today’s software is largely incompatible with future parallel processing.

ETJ: What applications lend themselves to parallelism?

IMB: The obvious class is graphics. Indeed the latest graphics chips already support some parallel operations. As a result, some of today’s supercomputers are built from graphics chips, even though they might not be ideal.

A lot of heavy scientific applications also have easily exploitable parallelism, because they are array based.

And, of course, operating systems are a natural for parallelising, since an operating system runs lots of activities that should, ideally, be taking place at the same time, instead of competing for processing time on a single processor.

But beyond that, most applications are not readily mappable onto a regular array.

ETJ: In that case, how do you do it?

IMB: After writing the application with explicit parallelism, you need to run a new type of program – a mapper – that maps the program onto a specific set of processors, analyses the interactions, and then optimises the mapping for that particular configuration. The mapper must do this dynamically, without too much cost in processing and execution time. In this way you can write the program and only then worry about the mapping into the hardware.
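
What such a mapper might do can be sketched, very loosely, in Go. Everything here is our own assumption rather than Barron's design: the Traffic type, the mapProcesses and crossTraffic functions, the capacity limit and the greedy strategy are hypothetical, and the sketch is static, whereas he asks for a mapper that works dynamically.

```go
package main

import "fmt"

// Traffic records how much two processes communicate (hypothetical units).
type Traffic struct {
	A, B   int // process indices
	Volume int
}

// crossTraffic totals the volume that must travel between processors,
// where placement[p] is the processor assigned to process p.
func crossTraffic(placement []int, traffic []Traffic) int {
	total := 0
	for _, t := range traffic {
		if placement[t.A] != placement[t.B] {
			total += t.Volume
		}
	}
	return total
}

// mapProcesses greedily places each process on the processor where it has
// the most traffic with already-placed processes, subject to a capacity limit.
// It returns nil if the processors cannot hold all the processes.
func mapProcesses(nProc, nCPU, capacity int, traffic []Traffic) []int {
	if nCPU*capacity < nProc {
		return nil
	}
	placement := make([]int, nProc)
	for i := range placement {
		placement[i] = -1 // not yet placed
	}
	load := make([]int, nCPU)
	for p := 0; p < nProc; p++ {
		affinity := make([]int, nCPU)
		for _, t := range traffic {
			other := -1
			if t.A == p {
				other = t.B
			} else if t.B == p {
				other = t.A
			}
			if other >= 0 && placement[other] >= 0 {
				affinity[placement[other]] += t.Volume
			}
		}
		best := -1
		for c := 0; c < nCPU; c++ {
			if load[c] >= capacity {
				continue // processor is full
			}
			if best < 0 || affinity[c] > affinity[best] ||
				(affinity[c] == affinity[best] && load[c] < load[best]) {
				best = c
			}
		}
		placement[p] = best
		load[best]++
	}
	return placement
}

func main() {
	traffic := []Traffic{{0, 1, 10}, {2, 3, 10}, {1, 2, 1}}
	placement := mapProcesses(4, 2, 2, traffic)
	fmt.Println(placement, crossTraffic(placement, traffic)) // [0 0 1 1] 1
}
```

The same process graph could be handed to mapProcesses with a different processor count, which is the property the interview is after: the program is written first, and the fit to the hardware is worked out afterwards.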

ETJ: Does each application have to be linked to specific hardware?

IMB: No, the mapping program provides the link. What is necessary is that the mapping does not introduce undue software overhead, which is why I said that there must be a close correspondence between the hardware and software models for a process.

ETJ: Should we expect new languages for parallel applications?

IMB: The sensible approach is to extend the existing languages with parallel constructs. There is too much invested in software to be able to re-write applications in an entirely new language. Adobe, for example, doesn’t want to completely re-write Photoshop to exploit parallelism. Instead it should be possible to write new segments that take advantage of parallelism while still being able to communicate with existing parts of the program that are still sequential. In this way, an application can be transformed into a parallel version over several market releases. What this means is adding parallel constructs to existing languages, such as C or Java.

This approach also has the advantage that, by adding parallelism to an existing language, there is a large resource of programmers who understand that language and need to learn only the new constructs or instructions.
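
The incremental route can be illustrated with a small Go sketch. Go stands in here for an existing language with parallel constructs added; the interview names C and Java. The brighten routine and the toy image are our own hypothetical example, not anything from Adobe or from Barron.

```go
package main

import (
	"fmt"
	"sync"
)

// brighten is the "existing" sequential routine, left untouched.
func brighten(row []int) {
	for i := range row {
		row[i] += 10
	}
}

// brightenAllSequential is how the original program processes an image.
func brightenAllSequential(img [][]int) {
	for _, row := range img {
		brighten(row)
	}
}

// brightenAllParallel is the new segment: same inputs and outputs,
// but each row is handed to its own goroutine. Rows do not overlap,
// so the workers need no further coordination.
func brightenAllParallel(img [][]int) {
	var wg sync.WaitGroup
	for _, row := range img {
		wg.Add(1)
		go func(r []int) {
			defer wg.Done()
			brighten(r)
		}(row)
	}
	wg.Wait()
}

func main() {
	a := [][]int{{1, 2}, {3, 4}}
	b := [][]int{{1, 2}, {3, 4}}
	brightenAllSequential(a)
	brightenAllParallel(b)
	fmt.Println(a) // [[11 12] [13 14]]
	fmt.Println(b) // [[11 12] [13 14]]
}
```

The rest of the program can call either version without knowing which one it gets, so one segment at a time can be converted between releases.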

ETJ: If it is as simple as you suggest, why are people still saying that exploiting parallelism is difficult?

IMB: It is mainly because people haven’t thought through the issues properly.

Parallelism isn’t difficult; it is just that people, particularly computer scientists, are prejudiced against it by training and by inclination. Programmers are trained to think serial and to work within the limitations of sequential processing. Electronics engineers, by contrast, should be more open, since when they design a chip or an application, they are designing an inherently parallel system, so they are trained to think parallel.

ETJ: You keep saying that the people who say that developing parallel systems is hard are prejudiced and don’t understand. Doesn’t that make everyone out of step except you?

IMB (grinning): That’s right – they are all wrong!

Forty years ago, Edsger W. Dijkstra wrote a key paper, “Go To Statement Considered Harmful” (Communications of the ACM, Vol. 11, No. 3, March 1968). What we need now is a new paper for parallel computing: “Input, Output and Semaphores Considered Harmful”. This would explain how, by formalising the communications between processes, parallel processing becomes easy.

ETJ: Finally: why, if the transputer was such a good idea, has it disappeared?

IMB: Firstly we were under-funded. We only had enough funding to develop the first transputer, and even then not enough for the programming languages. Although the transputer was a great success, it was not until the 1990s that the next generation T9000 design was launched, and by that time it was several years too late, since Intel had been producing a new generation of processor every two years.

Secondly, the communications between transputers were severely restricted by the silicon area in the first design, and this limited the configurability of applications. It wasn’t until the T9000 that this problem was resolved.

Finally we tried to do too much at the same time, with a radically new architecture, a radically new development environment, a radically new programming language, and even our own version of a PC for development.

But the ideas behind the transputer have not disappeared, even if some people have forgotten them. Watch this space.
