Many years ago, I said that there are two kinds of programmers: those who admit they hate x86 processors, and liars. Over the intervening years, my attitude toward x86 chips has softened quite a bit, to the point where I actually like them now. I think the change of heart is because the chips themselves got better, not because I’ve gone all soft in the head. But you can decide for yourself.
If you haven’t used an embedded x86 chip in anger for the past 10 years or so, you might be in for a treat. They’re a lot easier to program than you may remember, and they’re pretty useful for embedded systems. Granted, Intel still treats the embedded market as a distant cousin compared to its PC business, but as that latter market slowly evaporates, the embedded chips rise in relative status. That means you and I are more interesting to the Intel executive staff than we used to be. If things keep going this way, we might get invited to the company Christmas party in a few years.
One of the things I used to hate about early x86 processors was their memory segmentation. Other 32-bit chips had flat memory models, with one smooth, unbroken address range from 000000 all the way to 0xFFFFFF. I could put physical memory and I/O anywhere I wanted it in the address map, and I could address it without any strife or struggle. In contrast, early x86 chips made me jump through oddly sized hoops to reach anything, and they used this weird two-register segment+offset address model that didn’t have much correlation with the way I wanted to organize memory.
No more. Now segmentation is one of my favorite features – because it doesn’t work anything at all like it used to. Now it’s segmentation in name only. What it really is, is a cool debugging and security feature, and I’ve found it very useful. Modern segmentation has helped me trap all sorts of software errors, and it’s kept my code from running off into the weeds like a Mendocino pot grower.
Segmentation has a couple of nifty features, all of which can help you either find bugs or minimize their effects when they happen anyway. First off, modern x86 processors define three types of memory segments: code, data, and stack. That means you get to (have to, actually) identify every range of memory that you use as one of those types. That doesn’t mean you have to define the entire address space of the processor; just the address ranges that you actually use. In essence, you’re telling the chip, “this stuff here is code, this over here is data, and my stack lives over there.”
You can have as many of these segments as you want; you’re not limited to one code segment, for example, or a single data or stack segment. In fact, go nuts. Define a whole bunch of code segments, and so on. We’ll see why in a minute.
The first benefit of this identification process is that the chip will know what parts of memory are okay to execute (i.e., code segments), which parts are okay to write data into (data and maybe stack segments), and which parts should be addressed backwards and are okay for pushing parameters (stack segments). That means an x86 processor will never mistakenly execute your data, even if you screw up a jump or subroutine call. The chip “knows” that certain address ranges contain data or stack, not code, and will simply refuse to jump there. So right off the bat, you’ve got yourself a handy bug-containment feature.
It works the other way around, too. The chip will write data only into data segments, never code segments. It won’t even write into a stack segment unless you’re doing a stack operation, like pushing parameters. That means you won’t accidentally write over the top of your code with a mismatched pointer, botched dereference, or runaway index. The chip just won’t do it. “Nope, that there’s code, and I ain’t writin’ nothin’ in there,” you can almost hear it say.
Ah-hah, but what if I actually want to have self-modifying code? How can I update my code if I can never write into code space? No problem: you simply define a data segment that covers the same address range as the code. Modern x86 chips have no problem with overlapping segments, so it’s okay to dual-define some addresses as both code and data. For safety (and your own sanity), you should keep the overlap as small as possible to avoid exposing more code than necessary. And since you can define and delete segments on the fly at run-time, you can create the overlapping data segment just before you need it, and then delete that definition right after you’re done with the update. The rest of the time, your code space is execute-only, and there’s no way to accidentally overwrite it.
Segments can be any size you like, from a single byte to the whole universe of addresses. And you can have as many as you want. You can combine these two features to your advantage to make your system more secure and easier to debug. If you’ve got code scattered over several ranges (a ROM here, some diagnostic code there, and the operating system here, here, and here…) then the best approach is to define code segments that snugly encompass each chunk of code – no bigger and no smaller. That way, you can never run “off the edge” of the code segment and into undefined memory or data/stack space. The chip knows where the code ends, and it simply won’t fetch or execute anything outside of those boundaries. You can also scatter multiple data spaces around the memory map, and the chip will prevent stray writes to anything outside of those areas. Enclosing your stack(s) within tightly defined boundaries also prevents nasty stack underruns, which can be difficult bugs to locate any other way.
Old x86 chips used to “wrap around” the end of a segment, which a few programmers used as a cheap loop-counter trick. Mostly it was a bug, however. Modern x86 chips won’t wrap around the end of a segment anymore, but instead trap it as the programming error it usually is.
Memory segmentation is pretty handy, both for defining what parts of memory you’re using, but also what parts you aren’t. Most systems have a bunch of holes in their address map, where the vast majority of addresses aren’t used at all (who has terabytes of RAM in their embedded system?). Chances are, 90-some percent of addresses are invalid, so having a chip that catches and traps these is a nice bonus. Over the years, memory segmentation has gone from being a nuisance to becoming a real benefit. I may eventually get used to the x86 architecture after all.
Back in 1983, my group was developing a realtime multichannel measurement system that did a lot of calculation on each channel’s data, so the data area was almost all per-channel rather than common to all channels. The software was written in assembler to run on an 8086. Rather than indexing each variable in the data area by channel number, our software developer Phil Kane gave each channel its own data-segment value, so the indexing by channel number was done in 8086 hardware. Phil named the data area “virram”, for Virtual RAM.