
Parallel Processing Considered Not Harmful

Parallel processing has long been held to be the way to achieve high processing throughput at a reasonable cost. Yet there are still few generally available systems, and it is seen as being difficult to do. European Editor Dick Selwood spoke to the pioneer of parallel processing, Iann Barron, about the issues.

Thirty years ago Iann Barron was one of the founders of the British semiconductor company Inmos. The company was set up for several reasons, but one of them was to commercialise parallel processing, using a specialised device called the transputer. Inmos has now disappeared, the remains of its technology being held by ST, but its legacy lives on in many different companies and in a general acceptance of parallel processing.

In an ideal parallel processing environment, tasks are spread across an array of processors, and adding more processors increases performance. For some applications, every processor in the array can carry out the same instruction at the same time (Single Instruction, Multiple Data, or SIMD), rather like soldiers drilling. Most applications in the real world, however, require Multiple Instruction, Multiple Data (MIMD), where each processor carries out an individual task and communicates with other processors as needed. MIMD is the target of much of the work on parallelism, and it is the difficulty of writing software for such systems that is seen as the barrier to the widespread adoption of parallel processing.
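To make the distinction concrete, the sketch below (ours, not Barron's) uses OpenMP pragmas in C: the array addition is SIMD-style, with every worker applying the same operation to different data, while the sections construct is MIMD-style, with each worker running an unrelated task. The routines and numbers are purely illustrative.

```c
/* Illustrative sketch only: SIMD-style data parallelism vs MIMD-style
 * task parallelism, expressed with OpenMP pragmas in C.
 * Compile with: cc -fopenmp simd_vs_mimd.c */
#include <stdio.h>
#include <omp.h>

#define N 1000000

static double a[N], b[N], c[N];

/* SIMD-style: every worker applies the same operation to different data,
 * like soldiers drilling. */
static void add_arrays(void)
{
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];
}

/* MIMD-style: each worker runs a different task, proceeding independently. */
static void independent_tasks(void)
{
    #pragma omp parallel sections
    {
        #pragma omp section
        { /* task 1: e.g. decode input */   printf("decode on thread %d\n",  omp_get_thread_num()); }
        #pragma omp section
        { /* task 2: e.g. update display */ printf("display on thread %d\n", omp_get_thread_num()); }
        #pragma omp section
        { /* task 3: e.g. log statistics */ printf("logging on thread %d\n", omp_get_thread_num()); }
    }
}

int main(void)
{
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }
    add_arrays();
    independent_tasks();
    printf("c[10] = %f\n", c[10]);
    return 0;
}
```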

Iann Barron has been thinking about parallel systems for over thirty years and has published widely. Embedded Technology Journal spoke to him about the current issues.

ETJ: Why is parallel processing important?

Iann Barron (IMB): Existing techniques to improve processor performance have reached or passed their limit (a processor’s silicon is already mostly cache RAM), and yet the technology allows even larger processor chips to be built. Parallel processing is the way that performance can continue to improve.

ETJ: How far have people moved on in adopting parallelism in the last thirty years?

IMB: They haven’t: if anything they are behind where we were thirty years ago.

ETJ: But we have seen things like Intel’s dual core entering the mainstream.

IMB: Yes, but Intel has hit the wall with the quad core. All they are doing is multi-threading on several processors rather than just one; there isn’t enough inherent parallelism in Windows and current user applications to keep even four processors usefully busy.

The inexorable increase in the capability of silicon means that we can build ever larger processors; however, the current techniques for increasing the performance of a processor ran into the buffers long ago (already the silicon area of a processor is mostly memory), and now no one has any good idea what to do, except to increase the number of processors on a chip.

Now that is a good idea, except that using more processors means rewriting applications, and that makes Intel vulnerable unless it has a very clever strategy to move the market onto multi-processor applications.

ETJ: Is the Intel Quad Xeon a good model for future parallel processing?

IMB: The Xeon is an impossibly over-complicated processor, which is almost certainly not an effective use of silicon. Simpler processors, which are properly designed to support parallel processing, will prove more cost effective, and if Intel does not produce them, one of its competitors will – and that is another strategic problem for Intel.

ETJ: What does it take to design a proper parallel processor?

IMB: The key thing is that there must be a close correspondence between the hardware model of the processor and the software model of a process – the basic parallel task unit. In particular, the communications model between processors needs to correspond to the communications model between processes.

This is because it will be necessary to map the same application onto different parallel configurations (how many processors will there be in your PC?), so the computational overhead caused by mapping a set of processes (the application) onto a specific set of processors (the computer) must be kept low.

The transputer got this exactly right. A single sequential processor could support many processes, being able to swap between processes in the equivalent of 10 instructions, while the communication between processes was identical, whether they were running on the same processor or different processors.
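The transputer implemented its channels in hardware and microcode, and its native language was occam; neither is shown here. As a rough software analogy only, the sketch below builds a tiny blocking channel in C with POSIX threads, so that two "processes" communicate through send and receive operations. It is the shape of the programming model Barron describes, not how the transputer actually did it.

```c
/* Rough analogy of channel-based process communication, in C with POSIX
 * threads. Compile with: cc -pthread channel_demo.c */
#include <pthread.h>
#include <stdio.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             value;
    int             full;   /* 1 while a sent value is waiting to be received */
} channel_t;

static void chan_init(channel_t *ch)
{
    pthread_mutex_init(&ch->lock, NULL);
    pthread_cond_init(&ch->cond, NULL);
    ch->full = 0;
}

/* Blocks while a previous value has not yet been received. */
static void chan_send(channel_t *ch, int v)
{
    pthread_mutex_lock(&ch->lock);
    while (ch->full) pthread_cond_wait(&ch->cond, &ch->lock);
    ch->value = v;
    ch->full = 1;
    pthread_cond_broadcast(&ch->cond);
    pthread_mutex_unlock(&ch->lock);
}

/* Blocks until a value has been sent. */
static int chan_recv(channel_t *ch)
{
    pthread_mutex_lock(&ch->lock);
    while (!ch->full) pthread_cond_wait(&ch->cond, &ch->lock);
    int v = ch->value;
    ch->full = 0;
    pthread_cond_broadcast(&ch->cond);
    pthread_mutex_unlock(&ch->lock);
    return v;
}

static channel_t ch;

static void *producer(void *arg)        /* one "process" */
{
    for (int i = 0; i < 5; i++) chan_send(&ch, i * i);
    chan_send(&ch, -1);                 /* end-of-stream marker */
    return NULL;
}

static void *consumer(void *arg)        /* another "process" */
{
    int v;
    while ((v = chan_recv(&ch)) != -1) printf("received %d\n", v);
    return NULL;
}

int main(void)
{
    pthread_t p, c;
    chan_init(&ch);
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```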

ETJ: So all we need is a new processor designed for parallel processing?

IMB: No – you have to understand that you can’t take existing programs and run them through a software black box to make them into parallel programs. The act of creating a serial program means that the information about the parallel aspects of an application has been lost, and there is no easy way to get it back. So today’s software is largely incompatible with future parallel processing.

ETJ: What applications lend themselves to parallelism?

IMB: The obvious class is graphics. Indeed the latest graphics chips already support some parallel operations. As a result, some of today’s supercomputers are built from graphics chips, even though they might not be ideal.

A lot of heavy scientific applications also have easily exploitable parallelism, because they are array based.

And, of course, operating systems are a natural for parallelising, since an operating system runs lots of activities that should, ideally, be taking place at the same time, instead of competing for processing time on a single processor.

But beyond that, most applications are not readily mappable onto a regular array.

ETJ: In that case, how do you do it?

IMB: After writing the application with explicit parallelism, you need to run a new type of program – a mapper – that maps the program onto a specific set of processors, analyses the interactions, and then optimises the mapping for that particular configuration. The mapper must do this dynamically, without too much cost in processing and execution time. In this way you can write the program and only then worry about the mapping into the hardware.
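As a purely hypothetical illustration of the simplest thing a mapper might do, the C sketch below assigns a set of processes with estimated costs to processors using a greedy least-loaded rule. A real mapper of the kind described would also analyse the communication between processes and adapt dynamically, none of which is attempted here.

```c
/* Hypothetical sketch of the simplest possible "mapper": greedy assignment
 * of parallel processes to processors by estimated load. */
#include <stdio.h>

#define NPROCESSES  8
#define NPROCESSORS 3

int main(void)
{
    /* Estimated cost of each process (arbitrary illustrative numbers). */
    int cost[NPROCESSES] = { 7, 3, 9, 2, 5, 4, 6, 1 };
    int load[NPROCESSORS] = { 0 };
    int placement[NPROCESSES];

    /* Greedy rule: place each process on the currently least-loaded processor. */
    for (int p = 0; p < NPROCESSES; p++) {
        int best = 0;
        for (int q = 1; q < NPROCESSORS; q++)
            if (load[q] < load[best]) best = q;
        placement[p] = best;
        load[best] += cost[p];
    }

    for (int p = 0; p < NPROCESSES; p++)
        printf("process %d -> processor %d\n", p, placement[p]);
    for (int q = 0; q < NPROCESSORS; q++)
        printf("processor %d load = %d\n", q, load[q]);
    return 0;
}
```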

ETJ: Does each application have to be linked to specific hardware?

IMB: No, the mapping program provides the link. What is necessary is that the mapping does not introduce undue software overhead, which is why I said that there must be a close correspondence between the hardware and software models for a process.

ETJ: Should we expect new languages for parallel applications?

IMB: The sensible approach is to extend the existing languages with parallel constructs. There is too much invested in software to be able to rewrite applications in an entirely new language. Adobe, for example, does not want to completely rewrite Photoshop to exploit parallelism. Instead, it should be possible to write new segments that take advantage of parallelism while still being able to communicate with existing parts of the program that remain sequential. In this way, an application can be transformed into a parallel version over several market releases. What this means is adding parallel constructs to existing languages, such as C or Java.
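As one concrete reading of "adding parallel constructs to an existing language", the hypothetical C fragment below keeps an existing sequential routine untouched and adds a parallel version behind the same interface (here an OpenMP pragma stands in for the added construct), so the still-sequential parts of an application can call either. The brightness filter is invented for illustration and has nothing to do with Photoshop's actual code.

```c
/* Hypothetical sketch of incremental parallelisation of an existing C code
 * base. Compile with: cc -fopenmp brighten.c */
#include <stdio.h>
#include <stddef.h>

/* Existing sequential code path, left untouched. */
void brighten_seq(unsigned char *pixels, size_t n, int delta)
{
    for (size_t i = 0; i < n; i++) {
        int v = pixels[i] + delta;
        pixels[i] = (unsigned char)(v > 255 ? 255 : (v < 0 ? 0 : v));
    }
}

/* New parallel segment: same interface, same result, so callers in the
 * sequential parts of the application need not change. */
void brighten_par(unsigned char *pixels, size_t n, int delta)
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        int v = pixels[i] + delta;
        pixels[i] = (unsigned char)(v > 255 ? 255 : (v < 0 ? 0 : v));
    }
}

int main(void)
{
    unsigned char img[8] = { 10, 250, 100, 0, 200, 55, 127, 255 };
    brighten_par(img, 8, 20);
    for (int i = 0; i < 8; i++) printf("%d ", img[i]);
    printf("\n");
    return 0;
}
```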

This approach also has the advantage that, by adding parallelism to an existing language, there is a large resource of programmers who understand that language and need to learn only the new constructs or instructions.

ETJ: If it is as simple as you suggest, why are people still saying that exploiting parallelism is difficult?

IMB: It is mainly because people haven’t thought through the issues properly.

Parallelism isn’t difficult; it is just that people, particularly computer scientists, are prejudiced against it by training and by inclination. Programmers are trained to think serial and to work within the limitations of sequential processing. Electronics engineers, by contrast, should be more open: when they design a chip or an application, they are designing an inherently parallel system, so they are trained to think parallel.

ETJ: You keep saying that the people who say that developing parallel systems is hard are prejudiced and don’t understand. Doesn’t that make everyone out of step except you?

IMB (grinning): That’s right – they are all wrong!

Forty years ago, Edsger W. Dijkstra wrote a key paper, "Go To Statement Considered Harmful" (Communications of the ACM, Vol. 11, No. 3, March 1968). What we need now is a new paper for parallel computing, "Input, Output and Semaphores Considered Harmful". This would explain how, by formalising the communications between processes, parallel processing becomes easy.

ETJ: Finally: why, if the transputer was such a good idea, has it disappeared?

IMB: Firstly, we were under-funded. We had only enough funding to develop the first transputer, and even then not enough for the programming languages. Although the transputer was a great success, it was not until the 1990s that the next-generation T9000 design was launched, and by that time it was several years too late, since Intel had been producing a new generation of processor every two years.

Secondly, the communications between transputers were severely restricted by the silicon area in the first design, and this limited the configurability of applications. It wasn’t until the T9000 that this problem was resolved.

Finally, we tried to do too much at the same time, with a radically new architecture, a radically new development environment, a radically new programming language, and even our own version of a PC for development.

But the ideas behind the transputer have not disappeared, even if some people have forgotten them. Watch this space.
