How to Handle x86 Inter-Task Communication

Keeping Your Tasks Separate, But Not Too Separate

All modern x86 processors can handle task switching automatically in hardware when running in 32-bit protected mode. That’s one of their nice features. That doesn’t prevent you from coding up your own custom tasking mechanism (Microsoft Windows does), but unless you’re aiming for some specific implementation, there’s no reason to ignore the built-in version.

Most of the time, you want your tasks to be separate and independent of one another. They shouldn’t interfere with, communicate with, or otherwise affect any other task. That’s usually the point. If one task crashes, it won’t take any other tasks down with it. If one task is slow or buggy or hangs, it won’t affect any of its neighbors. So far, so good.

But there are times when we do want tasks to interact in some way, usually by sharing data. Maybe one task collects data and feeds it to another task. Maybe you’ve divided up your steam whistle controller app into an elaborate state machine. Whatever. Sharing data between tasks requires breaking down some of the barriers that x86 hardware erects to keep tasks safe and secure. We’re not exactly bypassing the built-in safety measures. We’re just using them judiciously. 

Memory Overlap

The only way to communicate between tasks is to share data, and the only way to share data is to have overlapping memory spaces. Normally, we’d keep each task’s memory separate and apart from every other task’s memory. That’s how we prevent them from interfering with each other. But, to communicate, they’re going to have to find some middle ground: a shared space they both can access.

They don’t have to share their entire memory space. They don’t even have to share very much of it. It could be as little as one single byte that you use for some sort of flag or semaphore. Or they could share hundreds of megabytes. The size of the shared space is up to you, but the methods are the same either way. 

One way to implement the shared space is to have two data segments that overlap. Let’s say that Task A has a 1MB data space from address 0x0000_0000 up to 0x000F_FFFF, and Task B also has a 1MB data space, but it starts at 0x0008_0000. They overlap by half, so the upper 512KB for Task A is the same as the lower 512KB for Task B. 
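To check the arithmetic, here’s a minimal sketch in C that computes the shared region of two segments. The type seg_range_t and the function seg_overlap are made-up names for illustration, and the ranges are given as inclusive first and last byte addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A segment described by its first and last byte address (both inclusive).
 * Hypothetical helper type, not part of any x86 header. */
typedef struct {
    uint32_t base;
    uint32_t last;
} seg_range_t;

/* Stores the range common to a and b in *shared.
 * Returns false (and leaves *shared alone) if they do not overlap. */
bool seg_overlap(seg_range_t a, seg_range_t b, seg_range_t *shared)
{
    assert(a.base <= a.last && b.base <= b.last);

    uint32_t lo = a.base > b.base ? a.base : b.base; /* higher of the two starts */
    uint32_t hi = a.last < b.last ? a.last : b.last; /* lower of the two ends */
    if (lo > hi)
        return false;

    shared->base = lo;
    shared->last = hi;
    return true;
}
```

Feeding it Task A (0x0000_0000 to 0x000F_FFFF) and Task B (0x0008_0000 to 0x0017_FFFF) yields 0x0008_0000 to 0x000F_FFFF, which is the 512KB described above.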

Any data that Task A writes into that 512KB shared area will be visible to Task B, and vice versa. On the other hand, the lower 512KB of Task A’s segment is “private” to Task A, even though it’s part of the same data segment. (Same goes for the upper 512KB of Task B’s segment.) This highlights one of the tricky problems with sharing data through overlapping segments: none of this is visible to software. There’s no graceful C language construct that would tell the compiler that half of a given data segment is somehow different from the other half and shared with another task. The hardware is happy to implement this, but you’ll have to figure out how to make the software workable. Oh, and be sure to declare all the variables in that space as volatile, so the compiler doesn’t keep them in registers or optimize away accesses that another task depends on.
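One way to make the software side workable is to describe the shared area with a struct and reach it only through a pointer. The sketch below is an assumption-laden example, not a recipe: shared_area_t, producer_publish, and consumer_poll are invented names, and it assumes a single processor where one task writes and one task reads. Note that volatile only stops the compiler from caching or dropping accesses; it doesn’t make anything atomic. Also keep in mind that, with the segment bases used above, the same byte sits at offset 0x8_0000 in Task A’s segment but at offset 0 in Task B’s.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the first bytes of the shared area. */
typedef struct {
    volatile uint8_t  ready;  /* 1 = sample holds fresh data */
    volatile uint32_t sample; /* the value being handed over */
} shared_area_t;

/* Writer side: store the data first, then raise the flag. */
void producer_publish(shared_area_t *s, uint32_t value)
{
    assert(s != NULL);
    s->sample = value;
    s->ready = 1;
}

/* Reader side: returns 1 and copies the value out if one is waiting,
 * otherwise returns 0 and leaves *out untouched. */
int consumer_poll(shared_area_t *s, uint32_t *out)
{
    assert(s != NULL && out != NULL);
    if (!s->ready)
        return 0;
    *out = s->sample;
    s->ready = 0; /* hand the slot back to the writer */
    return 1;
}
```

Each task would point a shared_area_t pointer at its own view of the shared region; how you get that pointer depends entirely on your compiler and memory model.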

A slightly more elegant solution is to define an extra data segment that’s only for sharing data and give that segment to both tasks. We’d modify the situation above to define a data segment starting at address 0x0008_0000 with a size of 512KB. Then both tasks would have the same start and end addresses, and they’d both use that segment for nothing else. One advantage to this approach is that it triggers hardware-defined boundary checks on the shared area. If your code tries to read or write beyond the upper or lower limits of the shared area, the processor will trap it. 
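For the curious, this is roughly what that dedicated segment looks like as an 8-byte descriptor. The sketch packs base, limit, access byte, and flags into the standard protected-mode layout; make_descriptor and the ACCESS_ and FLAGS_ constants are names invented for this example. The limit is the last valid offset, so a 512KB segment has a limit of 0x7FFFF.

```c
#include <assert.h>
#include <stdint.h>

/* Access byte: present, DPL 0, code/data descriptor, data, writable. */
#define ACCESS_DATA_RW_DPL0 0x92u
/* Flags nibble: D/B = 1 (32-bit), G = 0 (limit counted in bytes). */
#define FLAGS_32BIT_BYTE_GRAN 0x4u

/* Packs a segment descriptor into its 64-bit in-memory form. */
uint64_t make_descriptor(uint32_t base, uint32_t limit,
                         uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu); /* limit field is only 20 bits wide */
    assert(flags <= 0xFu);     /* flags field is only 4 bits wide */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);            /* limit 15:0  -> bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;     /* base 23:0   -> bits 16-39 */
    d |= (uint64_t)access << 40;                 /* access byte -> bits 40-47 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48; /* limit 19:16 -> bits 48-51 */
    d |= (uint64_t)flags << 52;                  /* flags       -> bits 52-55 */
    d |= (uint64_t)(base >> 24) << 56;           /* base 31:24  -> bits 56-63 */
    return d;
}
```

Calling make_descriptor(0x00080000, 0x7FFFF, ACCESS_DATA_RW_DPL0, FLAGS_32BIT_BYTE_GRAN) produces 0x004792080000FFFF. Any offset above 0x7FFFF through that segment trips the limit check.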

Obviously, you can shrink (or grow) the size of this shared segment to be as small (or as large) as you want. Even a single shared byte is doable. You also don’t have to limit this to just two tasks. Any task that has access to the shared data segment can be a participant. Make it a party! 

As a slight upgrade, you can give one task read/write permission to the shared data segment but restrict the other tasks to read-only access. That way, you’ve got one “broadcaster” and multiple “receivers.” Once again, the hardware will implement the access protections for you, so you don’t have to worry about adding code to prevent the wrong task writing to the shared space when it shouldn’t. 

Segment Access

Which brings us to another issue. Sharing a data segment means sharing that segment’s descriptor in either the global descriptor table (GDT) or a local descriptor table (LDT). Using the GDT is easier, because every task has access to it, by definition. But that can be either good or bad. If the shared descriptor is in the GDT, every task will have the same access rights as every other task – same base address, same ending address, same read/write permissions, and same privilege level. You can’t implement the broadcaster/receiver setup if the shared segment is in the GDT. 
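Whether a task reaches the GDT or its LDT is decided by one bit in the segment selector it loads. As a small sketch (make_selector and the TABLE_ names are invented here), a selector is a 13-bit table index, a table-indicator bit, and a 2-bit requested privilege level:

```c
#include <assert.h>
#include <stdint.h>

enum { TABLE_GDT = 0, TABLE_LDT = 1 };

/* Builds a 16-bit segment selector:
 * bits 15-3 = descriptor index, bit 2 = table (0 GDT, 1 LDT), bits 1-0 = RPL. */
uint16_t make_selector(unsigned index, unsigned table, unsigned rpl)
{
    assert(index < 8192u && table <= 1u && rpl <= 3u);
    return (uint16_t)((index << 3) | (table << 2) | rpl);
}
```

So entry 3 in the GDT is selector 0x18, while entry 3 in the current task’s LDT is 0x1C. Same index, different table, and potentially a very different descriptor.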

To do that, you’ll need to be sure that the segment descriptor for the shared segment is not in the GDT and put copies in each LDT instead. Assuming that each task has its own LDT, and that tasks don’t all share a single LDT, you can drop a copy of the descriptor into the LDT of each task that needs one. Those descriptors can be slightly different – in fact, they should be different – so that some tasks get read/write permission and some don’t. All the other parameters, like base address, size, privilege level, etc., should probably be identical across tasks to avoid confusion.  
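The difference between the broadcaster’s descriptor and a receiver’s can be as small as one bit. The sketch below assumes you already have the read/write descriptor as a raw 64-bit value and derives the read-only copy by clearing the W bit; read_only_copy, is_writable, and the DESC_ names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_WRITABLE (1ULL << 41) /* W bit of a data-segment descriptor */
#define DESC_CODE     (1ULL << 43) /* set for code, clear for data */
#define DESC_S        (1ULL << 44) /* set for code/data, clear for system descriptors */

/* Returns a copy of a data-segment descriptor with write permission removed.
 * Base, limit, DPL, and flags are left exactly as they were. */
uint64_t read_only_copy(uint64_t desc)
{
    /* Only meaningful for data segments; bit 41 means something else for code. */
    assert((desc & DESC_S) != 0 && (desc & DESC_CODE) == 0);
    return desc & ~DESC_WRITABLE;
}

int is_writable(uint64_t desc)
{
    return (desc & DESC_WRITABLE) != 0;
}
```

For a 512KB read/write segment at 0x0008_0000 encoded as 0x004792080000FFFF, the receivers’ copy comes out as 0x004790080000FFFF. The first goes in the broadcaster’s LDT, the second in everyone else’s.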

Alternatively, if your tasks share a single LDT (which is perfectly acceptable), you’ll need just one copy of the shared-segment descriptor in that single, shared LDT. But if you do that, all tasks will have the same access rights. 

On the other other hand, if you’ve configured your task management so that some tasks share an LDT and some don’t, those with the communal LDT will have access to the shared data segment (including identical read/write permissions), while those tasks with their own LDT might not have access to the shared data segment at all. 

For yet another twist, privilege levels might also enter into the equation. Recall that every data segment has a descriptor privilege level assigned to it, from DPL0 to DPL3. Running code has a current privilege level, from CPL0 to CPL3, which it gets from its code segment. Code cannot access data that’s more privileged than it is, even if the data segment descriptor is otherwise visible in the GDT or the local LDT. Privilege rules always apply. So, some of your tasks might be able to access the shared data and some might not, depending on the privilege level of the code that’s running within the task. Isn’t configurability great?
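The rule itself fits in a couple of lines. When code loads a data segment register such as DS or ES, the processor compares the segment’s DPL against the weaker (numerically larger) of the code’s CPL and the selector’s requested privilege level, or RPL. Here’s a sketch of that comparison; data_access_allowed is a name invented for this example, and the stack segment, which follows a stricter rule, is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if code at privilege level cpl, using a selector with the
 * given rpl, may load a data segment whose descriptor has the given dpl.
 * Lower numbers mean more privilege. */
bool data_access_allowed(unsigned cpl, unsigned rpl, unsigned dpl)
{
    assert(cpl <= 3u && rpl <= 3u && dpl <= 3u);
    unsigned effective = cpl > rpl ? cpl : rpl; /* the weaker of the two */
    return effective <= dpl;
}
```

A CPL3 task asking for a DPL0 shared segment gets a fault instead of data, no matter which descriptor table the segment lives in.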

Corner Cases

There are plenty of ways to screw this up, mostly to do with aliasing. There are no limits on how you can define data segments. They can start at any address, have any arbitrary size, and carry any combination of read/write permissions and privilege levels. It’s all up to you. That’s a recipe for flexibility, as well as for mischief.

It’s easy – and fun! – to set up a data segment that’s an alias of another data segment. That means the two segments overlap 100%. They’re identical twins, clones, aliases. As an example, segment DS might point to the exact same range of memory as segment ES. But the hardware doesn’t know that, and neither does your compiler. If any of your tasks have the ability to create segment descriptors, then they also have the ability to screw up the carefully managed shared data spaces you’ve created above. 
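Since the hardware won’t flag aliases, any checking is up to you. As a hedged sketch, the code below pulls the base and limit back out of two raw descriptors and reports whether they cover exactly the same bytes. The function names are invented, and expand-down segments are ignored for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reassembles the 32-bit base from bits 16-39 and 56-63. */
uint32_t desc_base(uint64_t d)
{
    return (uint32_t)((d >> 16) & 0xFFFFFFu) |
           ((uint32_t)((d >> 56) & 0xFFu) << 24);
}

/* Returns the last valid offset in bytes. When the G bit (bit 55) is set,
 * the 20-bit limit counts 4KB pages rather than bytes. */
uint32_t desc_limit_bytes(uint64_t d)
{
    uint32_t limit = (uint32_t)(d & 0xFFFFu) |
                     ((uint32_t)((d >> 48) & 0xFu) << 16);
    if (d & (1ULL << 55))
        limit = (limit << 12) | 0xFFFu;
    return limit;
}

/* True if both descriptors cover exactly the same range of memory,
 * whatever their permissions or privilege levels. */
bool descriptors_alias(uint64_t a, uint64_t b)
{
    return desc_base(a) == desc_base(b) &&
           desc_limit_bytes(a) == desc_limit_bytes(b);
}
```

Note that a read/write descriptor and a read-only descriptor with the same base and limit count as aliases here, which is exactly why a task that can create descriptors can quietly hand itself write access.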

An even tougher loophole involves the MMU. You can alias memory through the logical-to-physical mapping in the MMU, even if the software-visible addresses are completely different. This will be invisible to the processor and to your software. Even the most carefully crafted segment definitions, with correct read/write permissions and privilege levels, can be undermined by a poorly thought-out MMU configuration. For example, segments DS and ES might be completely different and unrelated segments with different addresses, but if the MMU maps those logical addresses to the same physical addresses, they’ll become aliases of each other anyway. Realistically, this should never be a problem because you wouldn’t let normal tasks manipulate the MMU, but it can lead to very tough bugs. Let’s not drive ourselves crazy.
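To see how that happens, here’s a deliberately simplified model. Real x86 paging uses multi-level tables with permission bits; this toy uses a single flat array indexed by page number, and every name in it (toy_page_table_t, translate, NUM_PAGES) is invented for the example. The only point is that two different pre-translation addresses can land on the same physical byte.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u    /* 4KB pages */
#define PAGE_MASK  0xFFFu
#define NUM_PAGES  1024u  /* toy table covers the first 4MB only */

/* Toy single-level table: index = page number of the address going in,
 * value = physical frame number coming out. */
typedef struct {
    uint32_t frame[NUM_PAGES];
} toy_page_table_t;

/* Translates an address by swapping its page number for the mapped frame
 * number and keeping the 12-bit offset within the page. */
uint32_t translate(const toy_page_table_t *pt, uint32_t addr)
{
    uint32_t page = addr >> PAGE_SHIFT;
    assert(page < NUM_PAGES);
    return (pt->frame[page] << PAGE_SHIFT) | (addr & PAGE_MASK);
}
```

If entries 0x10 and 0x80 of the table both hold frame 0x200, then addresses 0x0001_0004 and 0x0008_0004 both translate to physical 0x0020_0004, and no amount of staring at the segment descriptors will tell you so.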

Oops, too late.
