Memory safety enforcement using CHERI

Kulasko · Post by **Kulasko** » 2022-04-04, 0:32:32

I recently stumbled upon the CHERI project because there is an ongoing effort in the Rust language to change its pointers in order to support CHERI's memory model. CHERI can also be supported by other languages such as C/C++ by imposing some restrictions, with the ultimate goal of greatly enhancing memory safety and enabling it to be formally proven, at least as far as I understand it.

https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/
This is the project's site. It provides enhancements not only for software but also for hardware, which makes it interesting in the context of ForwardCom.
On a basic level, it changes pointers to a structure with double the size of an address word, called a capability. Additional fields include permission flags, an object type describing the memory pool it belongs to, and bounds in both directions. There's also an associated validity flag.
New capabilities can only be constructed from existing ones, and a new capability's permissions can't exceed those of the capability it was constructed from. A deallocation should be able to invalidate any capability pointing to the freed memory.
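
To make the derivation rule concrete, here is a rough software model of it in C++. It is only a sketch: the struct layout, the field widths, and the names `Capability`, `derive`, and `can_load` are my own assumptions for illustration, not the actual CHERI encoding or instruction set, and invalidation on deallocation is not modelled at all.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical permission bits; the real CHERI permission set differs.
enum Perm : std::uint32_t { kLoad = 1u << 0, kStore = 1u << 1, kExecute = 1u << 2 };

// Illustrative model of a capability. Field names and widths are assumptions.
struct Capability {
    std::uint64_t address;  // current pointer value
    std::uint64_t base;     // lower bound (inclusive)
    std::uint64_t top;      // upper bound (exclusive)
    std::uint32_t perms;    // permission flags
    std::uint32_t otype;    // object type
    bool valid;             // validity flag
};

// Derive a new capability from an existing one. The child may only narrow the
// bounds and drop permissions; any attempt to widen either gives an invalid result.
inline Capability derive(const Capability& parent, std::uint64_t base,
                         std::uint64_t top, std::uint32_t perms) {
    Capability child{base, base, top, perms, parent.otype, false};
    const bool within_bounds = base >= parent.base && top <= parent.top && base <= top;
    const bool subset_perms = (perms & ~parent.perms) == 0;
    child.valid = parent.valid && within_bounds && subset_perms;
    return child;
}

// A load of `size` bytes at `offset` is allowed only if the capability is valid,
// carries the load permission, and the whole access lies inside the bounds.
inline bool can_load(const Capability& cap, std::uint64_t offset, std::uint64_t size) {
    if (!cap.valid || !(cap.perms & kLoad)) return false;
    const std::uint64_t addr = cap.address + offset;
    return addr >= cap.base && addr <= cap.top && size <= cap.top - addr;
}
```

In this model, deriving a read-only capability for a 256-byte sub-range of a larger region succeeds, while trying to derive one with the original bounds or with the store permission added back yields a capability whose validity flag is cleared.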

Supporting it would probably require substantial changes to the ForwardCom specification, but maybe the memory safety guarantees would be worth it. In any case, I think it is worth some investigation and discussion.

HubertLamontagne · Post by **HubertLamontagne** » 2022-04-12, 20:24:13

I gotta admit, I haven't seen other stuff like that yet. Interesting goal, tracking the allowed range of all memory addresses and whether they're heap/stack/global data, and even object encapsulation. Not quite sure what to think of it; it reminds me of 16-bit x86's protected mode FAR pointers (and the historical iAPX 432). Clearly they're into formal program verification as well.

Post by **agner** » 2022-04-13, 6:17:10

I am somewhat skeptical too. CHERI is proposing a very costly solution to a problem that can be solved in other ways.

I haven't looked too deeply into CHERI, but as I understand it, it requires a lot of changes in both hardware and software:

The hardware ISA must have an extra set of special 128-bit pointer registers in addition to existing register types. Add to this a new set of instructions for manipulating these registers, and extra addressing modes for dereferencing them.
Cache and RAM must have extra "capability tags" that are cleared when a memory element containing a pointer is overwritten with a non-pointer.
Compilers, linkers, and other binary tools must be modified to support the new kind of pointers.
Operating systems must be modified.
Legacy software must be rewritten to strictly separate pointers from other variables. For example, Windows functions are routinely casting pointers to integers and back again. This would not be allowed in CHERI.
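
For illustration, the kind of pointer-to-integer round trip meant in the last point looks roughly like this on a conventional flat-address machine. The names `Handle`, `Widget`, `to_handle`, and `from_handle` are made up for this sketch and do not come from any real API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical opaque handle type, in the style of legacy C APIs.
using Handle = std::uint64_t;

struct Widget { int value; };

// Pointer -> integer: only the address bits are kept.
inline Handle to_handle(Widget* w) {
    return static_cast<Handle>(reinterpret_cast<std::uintptr_t>(w));
}

// Integer -> pointer: works on a flat address space, because the address alone
// is enough to reconstruct the pointer there.
inline Widget* from_handle(Handle h) {
    return reinterpret_cast<Widget*>(static_cast<std::uintptr_t>(h));
}
```

On a flat address space the round trip returns the original pointer. Under a capability model, the plain integer would carry only the address and none of the bounds, permissions, or validity information, so the second conversion is the step that code like this relies on and that would have to be reworked.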

This is a very high cost to fix problems that are inherent to specific programming languages, in particular C and C++. A programmer who wants easy array bounds checking can simply use another programming language.

The problem of array bounds violation in C++ has already been solved by the use of container classes. You can make a container class with an overloaded [] operator that makes it behave just like an array, but with bounds checking. (Some of the standard C++ container classes are inefficient because they split a data structure into many separately allocated pieces, but I routinely make my own container classes with contiguous memory.) This solution is less safe if you are not using it systematically, but it is more flexible because you can tailor-make your own container classes with whatever features you need. If, for example, you want data structures that can be sub-divided, then make a container class that supports this feature. Solutions can be published and reused by others.
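
A minimal sketch of such a container might look like the following. The class name `CheckedArray` is hypothetical, and throwing `std::out_of_range` on a violation is just one possible policy assumed here; a real container could equally trap, log, or return a safe dummy element.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Hypothetical fixed-size array that keeps all elements in one contiguous
// allocation and checks every index.
template <typename T>
class CheckedArray {
public:
    // Elements are value-initialized (zero for built-in types).
    explicit CheckedArray(std::size_t n)
        : size_(n), data_(std::make_unique<T[]>(n)) {}

    std::size_t size() const { return size_; }

    // Overloaded [] behaves like a plain array access, but with a bounds check.
    T& operator[](std::size_t i) {
        if (i >= size_) throw std::out_of_range("CheckedArray index out of range");
        return data_[i];
    }
    const T& operator[](std::size_t i) const {
        if (i >= size_) throw std::out_of_range("CheckedArray index out of range");
        return data_[i];
    }

private:
    std::size_t size_;
    std::unique_ptr<T[]> data_;
};
```

With `CheckedArray<int> a(4);`, the accesses `a[0]` to `a[3]` work as with a normal array, while `a[4]` throws instead of touching memory outside the allocation.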

ForwardCom has many extra security features that I believe solve a lot of common security problems:

Separate call stack and data stack. The call stack cannot be corrupted by buffer overrun or other errors.
Pointer tables are stored in read-only memory by default, so they cannot be compromised.
Executable code is stored in memory areas with no read or write access.
Access rights can be defined at the thread level. If you need a sandbox for safe handling of incoming data or code, you can handle it in a separate thread with access only to its own limited memory area.
Drivers and system functions have carefully controlled access rights. An application function that calls a driver can give it access to only a specific memory area.

ForwardCom is an open instruction set that is useful for experiments like this. Anybody is allowed to make experimental versions with added features. But my personal opinion is that CHERI is a very expensive solution to a problem that can be solved in other ways. It will cost performance because it makes the hardware more complicated, consumes more power, and adds more registers to save on context switches.

Kulasko · Post by **Kulasko** » 2022-05-11, 20:16:39

CHERI indeed requires a fair amount of changes, but as far as I have seen, it still looks reasonable.
I want to comment on a few points you made in your last post:

Most importantly, it can be implemented by extending the general purpose registers to 128 bit, instead of introducing another register set. This would also allow using the capability address as an integer, but not the other way around.
New instructions to clone a capability or further limit its scope or permissions would have to be introduced. However, the current addressing modes should work just fine, the only change being that the base pointer has to be a capability. In fact, capabilities automatically limit the offset that can be applied to them, potentially serving the same function as the current formats with an index limit.
Cache and RAM do need extra tagging, that is true.
A modified Clang/LLVM/LLD toolchain has already been introduced, as well as a modified GDB and a FreeBSD version.
It is true that legacy software might not respect the rules used by CHERI, so not all existing C(++) software might run on a CHERI-enabled hardware platform. However, I am not convinced that this is a decisive argument against it for a completely new ISA such as ForwardCom, because ForwardCom already has many parts that potentially break legacy software, such as an enforced ABI. Software like OpenSSH, WebKit and PostgreSQL has already been shown to compile successfully for a CHERI-enabled platform.

The CHERI project's goal is not to provide a secure C(++) target, but to provide verifiable memory safety throughout the whole platform, even in the presence of possible bugs in compilers or runtimes, or deliberately malicious code.

In total, I presented this to you because I want ForwardCom to "do it right" from the beginning. CHERI can provide a formally verified guarantee of zero memory safety attack vectors in the ISA specification, but does so at the cost of additional hardware and toolchain overhead. On the other hand, the currently specified ForwardCom features seem to eliminate a lot of attack vectors already, without incurring that overhead.
All that said, CHERI is sufficiently modular that the ForwardCom specification could simply be extended by it, as happened with ARMv8, should the need arise.

Something I forgot to mention in my initial post: CHERI-enabled ARMv8 hardware has been shipping since January. The SoC uses high-performance ARM N1 cores and did hit its targeted clock speed of 2.5 GHz. It does seem like the overhead of using CHERI is pretty negligible, but of course, it is also a big design with a lot of features, so a low relative overhead is to be expected. That said, they have not run SPEC yet.

forwardcom forum
