
Multi-… what?

In Search of a New Buzzword

We are now in the parallel era. More and more of the machines we use do more than one thing at a time.

Okay, some of them are capable of doing more than one thing at a time, even if the software that runs on them doesn’t know how to take advantage of that.

Problem is, there are many ways to create systems that run things in parallel. Most are very specific and have specific names associated with them. But there are characteristics that all of these techniques share – and yet there doesn’t seem to be a good way to refer to the bigger picture in a clear, concise, and unambiguous way.

Let’s start by reviewing all the ways you can run things concurrently, examining their ins and outs. We can then try to abstract up a level to see what we should call that. OK, truth in advertising – [spoiler alert] actually, I’m not going to have a suggestion; I’m going to punt and ask for help.

1. Multicore – SMP

The term “multicore” typically refers to a hardware architecture where a processor has more than one execution engine. But there are actually other ways people use “multicore”, so let’s start with the simplest of them all, “symmetric multicore processing,” or SMP. Of all the parallel techniques, this is probably the easiest because someone else (in the guise of an operating system) does the work for you.

With SMP, all cores are the same and their environments all look the same. In other words, an operating system can’t tell them apart. The good thing about that is that the OS can simply schedule tasks on whichever cores are available. It doesn’t have to know anything about individual idiosyncrasies that a core might have – because none of them have any.

Your PC is probably the best example of SMP. Wait… maybe “best” isn’t appropriate. How about, “most obvious” or “simplest.”
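To make that concrete, here’s a minimal sketch in Python (my choice of language for illustration – nothing above implies it). Because every worker looks the same, the OS is free to schedule each pool worker on whichever core happens to be available:

```python
from multiprocessing import Pool, cpu_count

def work(n):
    # Any core can run this; with SMP, the OS picks whichever is free.
    return n * n

def main():
    # Pool() defaults to one worker per core; scheduling is the OS's problem,
    # not ours -- which is exactly the appeal of SMP.
    with Pool() as pool:
        return cpu_count(), pool.map(work, range(8))

if __name__ == "__main__":
    cores, squares = main()
    print(cores, squares)
```

Note that the program never says which core runs what – it can’t, and that’s the point.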

2. Multi-processing

OK, I tossed this one in here because it’s not much different from SMP in some of its interpretations. The difference here is that, for some people, “multi-processor” means multiple chips with processors in them. In fact, you could have multi-processor and multicore in the same system: dual quad-core is exactly such a situation.

Does it matter whether it’s a chip or a core that’s multi? Well, there are some technical reasons why it might – the chips are more loosely coupled than the cores. But as long as the cores all look alike, an OS doesn’t care where they live, so this arrangement can also operate under the SMP moniker.

Of course, some people use the term – and, in particular, the gerund “multi-processing” – as a more generic term, but it’s too easily conflated with the narrower “multi-processor” meaning.

3. Multi-threading

This isn’t an architecture; it’s the concept of taking a process and breaking it into sub-processes called threads. If no one had used this term for this specific thing, it would be a great metaphoric generic term. Alas, too late.
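Since this one is a programming concept rather than an architecture, a tiny Python sketch (my language choice for illustration) can show a process splitting its work across threads that share the process’s memory:

```python
import threading

def main():
    # One process, four threads: each thread handles one slice of the work,
    # and all of them share the process's memory (the results list).
    results = [0] * 4

    def worker(i):
        results[i] = i * 10

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print(main())  # [0, 10, 20, 30]
```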

4. Multicore – AMP

From a hardware standpoint, this can be the same as SMP above. The difference is in the OS. Instead of a single OS managing a bundle of processors or cores identically, different cores may have different OSes. Or different instances of the same OS. Or no OS. One simple configuration is to have one core that runs Linux, for example, and manages the other cores by assigning them processes that run “on bare metal.”

One implication of this is the nature of what is running on a core. With SMP, the OS takes a process and schedules threads on the cores. With AMP, anything assigned to a core with its own OS is a process (or, if no OS, is simply a program).

You can also mix SMP and AMP; for example, with eight cores, four could be managed as SMP while tasks are assigned to the other four on bare metal. Of course, as a system designer, you’ve got to figure out how to make all of that work.
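As a loose software analogy (Python again, with hypothetical names), one process can play the managing core, handing jobs to a worker dedicated to a specific core that does nothing but service them. Note that sched_setaffinity is Linux-only, hence the hasattr check:

```python
import multiprocessing as mp
import os

def bare_metal_worker(core_id, conn):
    # Stand-in for a core running "on bare metal": no OS scheduling to speak
    # of, just a loop servicing whatever the managing core hands it.
    if hasattr(os, "sched_setaffinity"):  # Linux-only pinning
        try:
            os.sched_setaffinity(0, {core_id})
        except OSError:
            pass  # fewer cores than expected; run unpinned
    while True:
        job = conn.recv()
        if job is None:  # shutdown signal from the managing core
            break
        conn.send(job * job)

def main():
    # The parent process plays the "core running Linux" that farms out work.
    parent_end, worker_end = mp.Pipe()
    worker = mp.Process(target=bare_metal_worker, args=(0, worker_end))
    worker.start()
    results = []
    for job in (2, 3, 4):
        parent_end.send(job)
        results.append(parent_end.recv())
    parent_end.send(None)
    worker.join()
    return results

if __name__ == "__main__":
    print(main())  # [4, 9, 16]
```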

5. Heterogeneous (multicore)

Here the hardware is different for different cores. It could be as simple as having the same core but different memory or peripherals, but, more commonly, the cores themselves are different. Cell phone SoCs are a good example of this, running many different cores for many different tasks, and each core is chosen to optimize its performance on its assigned task.

Other common configurations are host microprocessors working in tandem with worker DSP chips or, increasingly, GPUs.

This also pretty much means the system has to look like AMP from an OS standpoint, since the resources aren’t uniform.

6. Grid computing

All of the configurations we’ve looked at so far tend to suggest a single computer or even a single chip. Communication between cores can use lighter-weight transport schemes – or, at least, ones designed to stay in the same county. PCI Express is an obvious example, but even just an AMBA bus might suffice in an SMP setup with shared memory. For managing the inter-process communication in something more complex than SMP, MCAPI can be used.

But with grid computing, the implication is of different boxes that might lie on different continents. The transport between them is Ethernet or the internet.

Of course, each box in this case runs its own OS; it’s AMP by definition. Inter-process communication is managed by a heavier-weight protocol, MPI, which can handle some of the more complex implications of computers that may come and go from the grid.
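A real grid would use MPI across real machines; the sketch below only fakes the shape of it on one machine, with Python’s multiprocessing.connection module standing in for the network transport (all names here are hypothetical):

```python
import threading
from multiprocessing.connection import Client, Listener

def serve(listener):
    # One "box" on the grid: accept a connection and square whatever arrives.
    with listener.accept() as conn:
        while True:
            msg = conn.recv()
            if msg is None:  # the peer is done
                break
            conn.send(msg * msg)

def main():
    # Port 0 lets the OS pick a free port; the authkey guards the connection.
    listener = Listener(("localhost", 0), authkey=b"grid-demo")
    node = threading.Thread(target=serve, args=(listener,))
    node.start()
    results = []
    with Client(listener.address, authkey=b"grid-demo") as conn:
        for job in (5, 6):
            conn.send(job)
            results.append(conn.recv())
        conn.send(None)
    node.join()
    listener.close()
    return results

if __name__ == "__main__":
    print(main())  # [25, 36]
```

What MPI adds on top of this shape is exactly the hard stuff – discovery, collective operations, and nodes that come and go.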

7. Many-core or massively parallel computing

“Many-core” is a nuance intended to distinguish the vast majority of simple multicore chips, which usually have at most four or so cores, from ones that go much further than that, like Tilera’s 64-core chip. It’s harder to deal with many-core chips because of the challenges of managing communication, I/O, and memory effectively, so things that work with simple SMP might well fail on a many-core chip.

“Massively parallel” is a term that, more or less, refers to the same thing. But it goes back much further, having referred to such boxes as the Thinking Machines beast of yore. So it has something of a dated feel to it, even though marketers like to pull it out if they think they can wow you with it. Just cuz it sounds cooler.

8. Parallel programming

I include this because “parallel” is an obvious candidate term for all the stuff we’re talking about. But it’s too focused on the “programming” part, suggesting some of the more arcane techniques being dreamt up in universities to make creation of parallel programs easier. Maybe a new language, maybe a new paradigm, maybe new tools.

9. Hardware acceleration

Here we step out of the realm of more obvious instances of parallel computing. And, in fact, sometimes this isn’t parallel at all.

The idea with hardware acceleration is that some task takes too long to do in software, so you assign it to faster hardware. There are two ways to do this: an easy way and a harder way.

The easy way is to have the processor encounter the need for an accelerator in a program and to fire off the accelerator and wait for the response. This is a so-called “blocking” configuration because the program that the processor is running has to stop and wait for the accelerator to return its result – progress is blocked. Yeah, it seems wasteful to have the processor sit around, but it’s still faster than doing the computation in software. And, if there’s more than one process or thread that needs to be handled, the OS can swap contexts while the accelerator is running, so, except for context-switching overhead, it doesn’t have to be too wasteful.

The harder way is to invoke an accelerator in a non-blocking manner. Now if the accelerator doesn’t have to return anything to the processor – it just does something (say, packages up a TCP packet and sends it along) and then comes back looking for more work, then this is trivial. The hard part is when you need the result of the accelerator, but you want the processor to keep going while the accelerator works. This is more or less a fork/join configuration where execution forks, one part continuing on the processor, the other going to the accelerator, with a subsequent join where the results of the two threads come together.
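Here’s a sketch of that fork/join shape in Python, with a one-worker thread pool standing in for the accelerator (the checksum function is purely hypothetical – substitute your offloaded task of choice):

```python
from concurrent.futures import ThreadPoolExecutor

def accelerate(data):
    # Stand-in for the hardware block (here, a toy byte-wide checksum).
    return sum(data) % 256

def main():
    with ThreadPoolExecutor(max_workers=1) as accel:
        # Fork: hand the heavy job to the "accelerator"...
        future = accel.submit(accelerate, range(1000))
        # ...while the processor keeps making progress on other work.
        other_work = [n * 2 for n in range(5)]
        # Join: block only at the point where the result is actually needed.
        return other_work, future.result()

if __name__ == "__main__":
    print(main())  # ([0, 2, 4, 6, 8], 44)
```

The blocking configuration is the degenerate case: call future.result() immediately after submitting, and the processor just waits.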

Even if you have only a single core doing the computing, this is parallel execution – though it doesn’t qualify as “multicore,” because there’s only one core. And yet, when an accelerator operates on a non-blocking basis, all of the considerations required for deciding how to create threads for two cores still apply. It’s just that no one thinks of it as multicore.

In fact, this whole search for an uber-term might not be so hard if it weren’t for this one stickler. Say “multicore” and people won’t think of hardware acceleration. Say, “hardware acceleration” and people won’t think of the kinds of things you have to consider for multicore. And yet, in the abstract, they’re really two flavors of the same thing.

So here I’ve identified nine different ways of referring to various aspects of the act of doing more than one thing at a time. Each term is either specific or overloaded (and, hence, ambiguous).

So your assignment is to come up with a term that can be used to refer to any of the above generically. We’re rising up a level of abstraction here, trying to find a new meta-term. Perhaps you think one of the above terms does the trick. (If someone else doesn’t agree, then, by definition, you’re wrong.) More likely, we have to resort to something new. Which means that someone would have to launch it and drive it home.

So… what’s it going to be? Multi-tasking? (Hmmm… that suggests the rough single-parent life or the life-on-the-edge of the commuter with a cell phone in one hand, an iPad in the other hand, and a cup of coffee in the other hand.) Concurrent computing? (Some people don’t think of hardware execution as “computing” because it’s not done by a “computer.”) Concurrent execution? (Sounds like something out of a Sergio Leone movie.)

I’m stuck. You gotta help me out on this one.
