In the beginning was the operating system (OS). And the OS begat processes and the processes begat threads, and all thrived under this benevolent hierarchy. For a thread, there was nothing outside the domain of the process. And for a process, there was nothing outside the domain of the OS. Protected within the bosom of their parents, these child entities had no need to interact with other children from outside this universe. Lack of a broader view was balanced by protection from possible predations.
And so life continued, from the top down. Such an approach molded the evolution of object orientation. With some exceptions (like being able to refer to a parent), an object can see only its properties and methods, along with anything downstream of them.
But I can’t be the only one who has sometimes wished that an object could also sample the context of the world outside it to help drive decisions. Top-down isn’t always sufficient, although it is neat and orderly, so it’s hard to promote potential chaos as a substitute.
All of that said, today’s heterogeneous multicore systems feel far less biblical. Modern systems force us to distinguish multiple religions, versions of the Bible and alternatives to the Bible. Different languages, different histories, different cultures. Most importantly, there’s no single deity-in-charge. In other words, we’re talking about multiple processors having different OSes that operate in isolation from each other.
This separate universe thing is fine if a process from one device is managing enterprise human resources and one from another processor is watching the surf forecast. They mostly don’t have a reason to interact (unless perhaps you’re wondering why Bob hasn’t shown up the last few days after that big storm…)
But, increasingly, we’ve needed to find ways to let processes from different OSes interact with each other. COM was one way of doing that, although it was (and is) very computer-oriented and, in practice, it mostly promoted interaction between processes under the same OS instance. It was not so much intended for resource-poor embedded systems.
The classic heterogeneous asymmetric multiprocessing (AMP) system would be the smartphone. This has worked because, by and large, the different processors do entirely different things. One handles HR stuff while the other watches the surf report. Oh wait, no: one might handle baseband processing while another does graphics, all alongside the mighty application processor.
Each has its own OS, but there’s really not much reason for these devices to interact except via basic commands exchanging instructions and data. That’s changing as AMP systems try to increase their sophistication by letting these cores not only talk to each other, but even manage each other.
[Figure; image courtesy of the Multicore Association]
So Mentor and Xilinx picked up some capabilities from Linux: things like remoteproc (rproc) for managing the life cycle of remote processors, virtio for device sharing, and rpmsg for inter-processor chats. These functions existed but might have been limited to use within a single Linux instance, so the two companies extended them to reach beyond the parent OS. They called it OpenAMP, and it’s in the Linux distribution now. But they’ve called in the Multicore Association (MCA) to standardize, formalize, document, and extend this capability.
A few years ago, the MCA made available some inter-process communication help in the form of MCAPI. It’s a mechanism that allows a direct discussion between parts of different processes. It’s been joined by MRAPI, which lets you manage resources outside your process, and MTAPI for managing tasks (“task” being an abstraction of process or thread or other executable entity).
These tools promoted flexibility in crossing process boundaries, but they’re low-level and somewhat fiddly, the kind of thing you can see being leveraged by architects and programmers who are suspicious of abstraction. Most importantly, however, they provide no mechanism for controlling the actual OS. That’s where OpenAMP is different. It lets programmers manage other OS entities within the system, including launching and killing them. A master core can bring down a core running one OS stack and restart it with a completely different stack, for example.
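To make that life-cycle idea a bit more concrete, here’s a toy model in Python. To be clear, this is not the OpenAMP API: the class, the method names (load, start, stop, restart_with), and the image names are all invented for illustration. All it tries to show is the sequence a master would walk through to bring a remote core down and relaunch it with a different stack.

```python
from enum import Enum


class State(Enum):
    OFFLINE = "offline"
    LOADED = "loaded"
    RUNNING = "running"


class RemoteCore:
    """Hypothetical stand-in for a core that a master is allowed to control."""

    def __init__(self, name):
        self.name = name
        self.state = State.OFFLINE
        self.image = None  # OS/firmware image currently loaded, if any

    def load(self, image):
        # A new image can only be loaded while the core is not running.
        if self.state is State.RUNNING:
            raise RuntimeError(f"{self.name}: stop the core before loading")
        self.image = image
        self.state = State.LOADED

    def start(self):
        if self.state is not State.LOADED:
            raise RuntimeError(f"{self.name}: no image loaded")
        self.state = State.RUNNING

    def stop(self):
        # Halts execution; the old image stays in place until replaced.
        if self.state is State.RUNNING:
            self.state = State.LOADED


def restart_with(core, image):
    """Master-side helper: bring a core down, then relaunch it with a new stack."""
    core.stop()
    core.load(image)
    core.start()


core = RemoteCore("core1")
core.load("rtos.elf")      # image names are made up
core.start()
restart_with(core, "linux.elf")
print(core.name, core.state.value, core.image)  # core1 running linux.elf
```

The one rule the sketch enforces is that a running core has to be stopped before it gets a new image, which is the "bring down, then restart" ordering described above.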
By standardizing OpenAMP, the MCA hopes to ensure that different implementations of OpenAMP work the same way; ideally, you could swap one for another without any system changes. There are already several to swap between: along with the announcement of the standardization effort, they announced offerings from Mentor, Micrium, and Xilinx.
This ecumenical philosophy goes only so far, however. One of their guiding principles is for “business-friendly APIs” that allow for “proprietary solutions.” The Xilinx implementation, for example, may include things that aren’t in other implementations and that are outside what’s been specifically standardized.
So the “swapping and it still works” thing applies in a limited sense: if, as a programmer, you stick to the standard APIs, then they’ll all work consistently and you can indeed swap things around. But if you include proprietary calls in your code, then you won’t be able to switch. The value of the standard in this case is that any calls to the standardized API will work identically to similar calls from other vendors.
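Here’s a small Python sketch of that trade-off. Everything in it is hypothetical: StandardMessaging stands in for whatever ends up being standardized, VendorA and VendorB for two implementations, and turbo_send for a proprietary extra. None of these names comes from OpenAMP or from any vendor.

```python
from abc import ABC, abstractmethod


class StandardMessaging(ABC):
    """Hypothetical standardized surface that every vendor must provide."""

    @abstractmethod
    def send(self, endpoint: str, payload: bytes) -> int:
        """Send payload to endpoint; return the number of bytes accepted."""


class VendorA(StandardMessaging):
    def send(self, endpoint, payload):
        return len(payload)

    def turbo_send(self, endpoint, payload):
        # Proprietary extra (invented name): not part of the standard surface.
        return len(payload)


class VendorB(StandardMessaging):
    def send(self, endpoint, payload):
        return len(payload)


def portable_app(impl: StandardMessaging) -> int:
    # Uses only the standardized call, so any implementation will do.
    return impl.send("ctrl", b"ping")


def locked_in_app(impl) -> int:
    # Relies on VendorA's proprietary call, so it only runs on VendorA.
    return impl.turbo_send("ctrl", b"ping")


print(portable_app(VendorA()), portable_app(VendorB()))
print(locked_in_app(VendorA()))
try:
    locked_in_app(VendorB())
except AttributeError as err:
    print("not portable:", err)
```

The portable function runs unchanged on either implementation; the one that reaches for the extension fails the moment you swap vendors.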
I asked whether this can work across chips or if it’s restricted to within a single SoC or FPGA. As I suspected, trying to move outside a single chip is not at all efficient. If you want to manage the cores within another processor chip, you’d need physical wires of some sort (along with a protocol) to enable the communication between the chips. Much easier to implement within an FPGA or using an SoC’s network-on-chip.
Likewise with messaging. Here the communication is often implemented through mailboxes in shared memory. And that memory is typically internal to the chip. If external memory were used, then, in theory, you could send a message for retrieval by another chip. But you’d still need a communication mechanism for sending the “doorbell” alert letting the other chip know that it needed to read the mailbox (unless you want it to poll, which expends precious clock cycles).
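The mailbox-plus-doorbell pattern is easy to simulate. In this Python sketch, a bytearray plays the role of the shared-memory mailbox and a threading.Event plays the role of the doorbell interrupt; two threads stand in for two cores. The one-byte length header and the 64-byte mailbox size are my own assumptions, not anything OpenAMP specifies.

```python
import threading

MAILBOX_SIZE = 64
mailbox = bytearray(MAILBOX_SIZE)  # stands in for a shared-memory region
doorbell = threading.Event()       # stands in for an inter-processor interrupt
received = []


def sender(message: bytes):
    # Write a one-byte length header plus the payload, then ring the doorbell.
    if len(message) > MAILBOX_SIZE - 1:
        raise ValueError("message too large for mailbox")
    mailbox[0] = len(message)
    mailbox[1:1 + len(message)] = message
    doorbell.set()


def receiver():
    # Sleep until the doorbell rings rather than polling the mailbox.
    doorbell.wait()
    length = mailbox[0]
    received.append(bytes(mailbox[1:1 + length]))
    doorbell.clear()


t = threading.Thread(target=receiver)
t.start()
sender(b"hello from core 0")
t.join()
print(received[0])  # b'hello from core 0'
```

Take the doorbell away and the receiver would have to loop on mailbox[0] waiting for it to change, which is exactly the cycle-burning polling mentioned above.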
In other words, it would be very inefficient to try to cross chip boundaries with this. While they don’t say you can’t, it’s clear that this isn’t really intended.
They’re looking at biennial upgrades, with V0 being available shortly. Documentation will likely follow in a few months.