How To Implement Virtual Memory, Part 3

We saw earlier how to use the MMU in your x86 processor (or almost any other modern processor) to space-shift your system’s memory. You can make memory appear to move around in the address space, you can make it magically appear where there isn’t any memory, and you can make it look like you’ve got more memory than you actually do, even making a tiny block of RAM look like a huge 4GB swath of memory. All neat tricks.

But you can also use the MMU to enforce privilege protection on your system memory, blocking access by certain programs. You can make your low-level operating system data structures off-limits, for example, or hide the existence of code ROMs, or make certain areas of RAM appear and disappear depending on who’s asking. This is all in addition to the x86 family’s usual methods of privilege protection, adding yet another tool in your toolbox.

Virtual memory information is broken into two parts, as we saw earlier. There’s one page directory with 1024 entries (called page directory entries, or PDEs), and each PDE points to a separate page table, which also has 1024 entries (called PTEs). That means you’ve nominally got 1025 tables residing in memory, although in practice you don’t need to create them all. You might be able to get by with only a few.

Each PDE and PTE is exactly 32 bits long, and their encoding is almost identical. The least significant bit is the Present bit, and it’s used to trigger a Page Fault, as we saw earlier. The Present bit applies to everything and everybody; it’s not dependent on privilege levels. But the next two bits in the PDE and PTE can be used to fine-tune memory access based on privilege.

Bit 2 is the User/Supervisor bit, and it’s a bit more nuanced. The name is old-fashioned, but the concept is sound. If the U/S bit in a PDE or a PTE is 1, the relevant area of memory is accessible to everyone regardless of privilege. In other words, it has no effect. Let’s treat U/S = 1 as the default case. But if U/S = 0, access is denied to code running at privilege level 3 (the lowest level). Any code running at privilege levels 0, 1, or 2 is unaffected and can access that area normally, but underprivileged code is locked out. It generates a Page Fault (exception 14) if it tries.

This is the “keep out the riffraff” bit. You can use it in your page directory and/or page tables to prevent the least-trusted code from reading, writing, or even learning of the existence of, certain areas of the physical address map. You might “disappear” certain areas of ROM, for example, or blank off sections of RAM.

Note that the U/S bit is applied only after the x86 processor’s usual checks for privilege. All the normal rules about descriptor privilege level (DPL) will be applied first, and if those fail, the U/S bit is irrelevant. It’s only if your code passes the normal segment-level checks that the additional MMU-level check comes into play. The U/S bit is your last line of defense against naughty code.

Bit 1 is the Read/Write bit, and you can probably guess its function. But it’s also trickier than it appears. If the U/S bit is clear – that is, if access is denied to lowest-privilege code – then R/W is ignored. Makes sense: If you’re already denying user/supervisor permissions, it doesn’t matter whether you deny read/write permissions in addition.

When U/S = 1, however, the R/W bit comes into play. Setting R/W = 1 allows both read and write access to privilege level 3 code. If R/W = 0, then code at privilege level 3 can read from the relevant area of memory but not write to it. Supervisor-level code (privilege levels 0, 1, and 2) are unaffected. This setting allows you to grant read-only access to sensitive areas of RAM, while also guaranteeing that untrusted programs can’t accidentally (or maliciously) write to that area, no matter how hard they try.

As before, these protections are applied only after the usual segment-level privilege checks, so your code would have to jump through those hoops first before the R/W bit is checked. In other words, the U/S and R/W protection bits only tighten the restrictions already in place. They can’t loosen or relax permissions in any way.

It’s also worth pointing out that the U/S and R/W settings in a PDE outrank those in a PTE. If a PDE has U/S = 0, then access is denied to all 1024 PTEs in the page table it points to. It doesn’t matter if any of those PTEs have their permissions set differently or more leniently. The “parent” table entry has already overruled all of its “child” table entries. Again, you can only tighten restrictions, not loosen them.

Finally, when you set or clear either the U/S bit or the R/W bit in a PDE (in the page directory), you’re affecting an entire 4MB block of physical address space. Setting or clearing the same bits in a page table entry (PTE) affects just a single 4KB block.

And, since these are always physical addresses, not linear or logical addresses, they apply to actual, physical memory devices in your address space. Denying access to a certain ROM, for example, works the same no matter how that ROM appears in the global or local descriptor tables (GDT or LDT), or how many times it’s aliased, or how its apparent address might shift around, or how you implement address translation. You’re locking out a hardware-defined range of addresses, not a relocatable program’s idea of where that ROM can be addressed. Fortunately, these restrictions apply only to the lowest privilege level, so you shouldn’t be able to lock yourself out of your own system memory. Maybe.

How To Implement Virtual Memory, Part 3

Related

Leave a Reply Cancel reply

featured video

How NV5, NVIDIA, and Cadence Collaboration Optimizes Data Center Efficiency, Performance, and Reliability

featured chalk talk