
A type-safe and zero-allocation library for reading and navigating ELF files

Over the Christmas break I made a start on an ELF library in Rust. This was interesting for me since I've not done a lot of library work before. I hope it will be interesting for you to read about because it shows off some of the important parts of Rust, and exercises a few dark corners you might not be familiar with.

The library itself still needs a lot of work - more at the end of the post. You can find it in the xmas-elf repo.

ELF (Executable and Linkable Format) is a binary format for storing object files - both before and after linking. It is used on Linux, BSD, and a lot of other (mostly Unix-like) platforms. An ELF file starts with a header, then has an optional program header table, then a bunch of sections, which for executables (cf. pre-linking) are grouped into segments, and finally a section header table. The xmas-elf library reads the binary data into Rust data structures and provides functions to navigate the sections.

Zero-allocation and type-safe libraries

One of the great advantages of Rust is that you can rely on the compiler to completely track the memory behind pointers and ensure you never have a pointer to freed memory. Whilst this is easy enough to do manually for a small program, in a large program it gets really hard. And if you are passing pointers backwards and forwards across a library boundary? Manual isn't a sane option.

With the Rust borrow checker looking after you though, you can go wild. We can take data from client code, return pointers into it, keep pointers to it hanging around, whatever. We remain safe in the knowledge that the pointers will never outlive the underlying data. That means we can avoid copying data and allocating memory and make some really fast libraries. In xmas-elf, we don't allocate any memory at all[1].

Rust (like C) allows the programmer to define the layout of data structures in memory. Furthermore, Rust data structures can be zero-overhead. That means we can just transmute (cast, in C terms) chunks of memory to these data structures and get a view of the raw data as structured data. Rust also has value semantics for its data structures (called interior allocation), so a nested struct resides inline in its owning struct in memory; there is no indirection via a pointer as in Java or many other high-level languages. So we can even transmute nested data structures.
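As a small standalone illustration (not library code; it uses the #[repr(C)] attribute discussed later to pin down the layout), nested structs compose their sizes directly because the inner struct is stored inline:

use std::mem;

#[repr(C)]
#[allow(dead_code)]
struct Inner {
    a: u32,
    b: u32,
}

#[repr(C)]
#[allow(dead_code)]
struct Outer {
    x: u32,
    inner: Inner, // stored inline, not behind a pointer
}

fn main() {
    assert_eq!(mem::size_of::<Inner>(), 8);
    assert_eq!(mem::size_of::<Outer>(), 12); // 4 bytes for x plus 8 for the inline Inner
}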

The techniques we use in the library are not (memory) safe; the compiler cannot verify the code. A language that insisted all code must be safe could not use these techniques. However, Rust allows for unsafe blocks of code where we can use them. Unsafe code is not infectious - safe code can use unsafe code, but the compiler ensures that unsafety is contained, minimising the surface area for memory safety bugs.

When I (and I believe others) talk about a library being type safe, I mean something very different from when a programming language is type safe (although one could argue that if the language isn't, then the library can't be). In the library sense, I mean that one cannot use a piece of data in a way that was not intended. The classic example from traditional C libraries is where they use an integer for everything, and so a function wants a parameter which means one thing, but you pass it data about another thing (because both are just integers).

Rust lets you write type safe libraries by virtue of its rich data types (and because the language itself is strongly and statically typed). By using structs rather than offsets we prevent accessing incorrect fields, by using enums rather than integers we can't get the wrong kind of variant, and we can't get variants which don't exist, nor miss variants which do, due to exhaustive checks. The newtype pattern means that even where we must use an integer, we can make sure we don't confuse integers which mean different things.

How xmas-elf works

We start with the input data, which is just a bunch of bytes. There is a helper function (open_file) which opens a file and reads it into a Vec<u8>, but it's not a core part of the library. We deal with the source data as a slice of bytes, &'a [u8].

Rust ensures that this slice will not outlive the original data. That 'a is an upper bound on how long anything backed by the source data can live. We use this one lifetime throughout the library, threading it around functions and data structures (all of which are generic, but basically only get this one lifetime. This would be a perfect use case for lifetime-parametric modules).
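As an illustration of how that one lifetime is threaded around (the names here are simplified and not necessarily the library's exact API):

// Everything that borrows from the input carries the same lifetime 'a.
pub struct ElfFile<'a> {
    pub input: &'a [u8],
}

impl<'a> ElfFile<'a> {
    pub fn new(input: &'a [u8]) -> ElfFile<'a> {
        ElfFile { input }
    }

    // The returned view carries 'a, not the lifetime of &self, so callers can
    // hold on to it for as long as the underlying buffer lives - and no longer.
    pub fn bytes_at(&self, offset: usize, len: usize) -> &'a [u8] {
        &self.input[offset..offset + len]
    }
}

Every other borrowed view in the library follows the same shape: the reference handed back is tied to 'a, not to the temporary borrow of the ElfFile.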

The next key technique is that we can cause Rust data structures to have a precise layout in memory. That means we can take a reference to a piece of memory and transmute it into a reference to a data structure. The data structure is backed by the memory we started with, but we use it like any other data.

Rust helps us in two ways here. First, it would be dangerous to mutate such data, since others could also reference it (possibly with a different view on it); by using & references instead of &mut references, Rust ensures we won't mutate the underlying data. Secondly, we can't have these data structures outlive the underlying memory. Once that memory is freed, it would be dangerous to access the data structure. By giving the reference to the data structure the same lifetime as the memory, Rust ensures this won't happen. When we transmute some part of our &'a [u8] data into a Foo, it must have type &'a Foo. Rust then ensures we can't go wrong.

You can see the code for this here.

Of course, transmuting is unsafe and we require unsafe blocks for it. But if we don't read the memory out of bounds, and we set up the lifetimes correctly (as described above), then using the resulting data structures is perfectly safe. This is a good example of how Rust helps isolate unsafe operations into explicit unsafe blocks.
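A minimal sketch of the technique, using a made-up Header type rather than the library's real one (the fields shown are the first few bytes of e_ident, which conveniently have alignment 1):

use std::mem;

#[repr(C)]
#[allow(dead_code)]
pub struct Header {
    pub magic: [u8; 4],
    pub class: u8,
    pub data: u8,
}

// Safe signature, unsafe body: reinterpret the bytes at `offset` as a Header.
// The returned reference shares the input's lifetime 'a, so it can never
// outlive the buffer it points into.
pub fn read_header<'a>(input: &'a [u8], offset: usize) -> &'a Header {
    assert!(offset + mem::size_of::<Header>() <= input.len());
    unsafe { mem::transmute(&input[offset]) }
}

If the buffer is dropped while the &Header is still live, the borrow checker rejects the program, which is exactly the guarantee described above.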

There is a bit of friction with enums. Many data types in ELF ought to be representable as enums: they use an integer for each variant, but then reserve a bunch of integers for processor-specific use (or other uses). That means there is a range of values which are specified, but not as individual variants. There is no Rust type that matches these semantics and has the required layout in memory, so we must use a two-step process.

First, we use a newtype wrapping an integer. This is used as a field in a data structure, etc., so that when we take a view on memory, it gets mapped to this newtype. That gives us some degree of type safety, but doesn't get us all the way there. Whenever we access these fields, we use a getter method which converts the newtype to an enum. That enum has variants for each defined value, plus variants for each of the undefined ranges; these latter variants have a single field holding the underlying value. Because there is no way to tell Rust about the ranges of these fields, Rust must use a discriminant in these enums (see below for more discussion), which means the in-memory representation is not the same as the underlying data (which is why we need the newtype). Using these enums should be safe and ergonomic. Unfortunately, converting between the two representations requires big, ugly match expressions (at least they are internal to the library and in no way exposed to the user).

(There is a language-level solution for this - range types - but I don't expect to see them in the near future, if at all.)
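A sketch of the two-step pattern (the constants for the processor-specific range come from the ELF spec; the exact variant set and method names in xmas-elf differ in detail):

// The newtype appears as a field in the memory-mapped struct, so it must have
// the same layout as a bare u32.
#[derive(Copy, Clone)]
#[repr(C)]
pub struct ShType_(pub u32);

// The enum is what callers see via the getter.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum ShType {
    Null,
    ProgBits,
    SymTab,
    // ... further defined variants elided ...
    ProcessorSpecific(u32), // a reserved range; carries the raw value
    Unknown(u32),
}

impl ShType_ {
    pub fn as_sh_type(self) -> ShType {
        match self.0 {
            0 => ShType::Null,
            1 => ShType::ProgBits,
            2 => ShType::SymTab,
            0x7000_0000..=0x7fff_ffff => ShType::ProcessorSpecific(self.0),
            v => ShType::Unknown(v),
        }
    }
}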

repr and other language features

One of the key features that makes xmas-elf work is that we have guarantees about the layout of data structures in memory. This does not come by default: unless the programmer signals otherwise, the compiler is free to optimise your data structures however it likes.

The way you specify the layout of a data structure is with repr attributes. The most common on structs is #[repr(C)], which tells the compiler to make the struct C-compatible. That is, the struct is laid out in the same way that a C compiler would lay it out: basically in the order the struct is written, with padding so that fields respect their natural alignment. This is what we want for xmas-elf. In some cases you want that layout without the padding, in which case you should use #[repr(packed)]. Note that you cannot take a reference to a field in a packed struct.
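A standalone illustration (not library code) of the difference:

use std::mem;

#[repr(C)]
#[allow(dead_code)]
struct Plain {
    a: u8,  // followed by 3 bytes of padding so that...
    b: u32, // ...b sits at its natural 4-byte alignment
}

#[repr(packed)]
#[allow(dead_code)]
struct Packed {
    a: u8,
    b: u32, // no padding; b may be misaligned, so taking &packed.b is rejected
}

fn main() {
    assert_eq!(mem::size_of::<Plain>(), 8);
    assert_eq!(mem::size_of::<Packed>(), 5);
}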

We also specify the representation of enums. For example, we use #[repr(u16)] to cause the compiler to represent the enum as an unsigned 16 bit integer. You can use any integer type for this. It only works with C-like enums. We also sometimes need to give some variants specific values, e.g., enum OsAbi { OpenBSD = 0x0C }.
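For instance (a standalone sketch; the values for EM_NONE and EM_X86_64 are the spec's, but the variant set is heavily abbreviated):

use std::mem;

#[repr(u16)]
#[derive(Copy, Clone, Debug)]
#[allow(dead_code, non_camel_case_types)]
enum Machine {
    None = 0,
    X86_64 = 62,
}

fn main() {
    assert_eq!(mem::size_of::<Machine>(), 2); // stored as a u16, as requested
    assert_eq!(Machine::X86_64 as u16, 62);
}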

Finally, we should be aware that if we implement Drop for a type, then the compiler adds a hidden field to that type called the drop flag. That should get removed from the language some day. But for now, if you implement Drop then you must take it into account when you transmute. We can't do that if we are transmuting slices of immutable binary data. Luckily, the compiler will at least warn us if a #[repr(C)] type would get a drop flag.

Slices and fat pointers

The ELF spec defines not just individual objects in the binary data, but sequences of objects. We would like to represent these as slices in Rust, and we can, but it takes a little more work than for individual objects.

We can't use fixed size arrays, because we don't know the length of these sequences statically (actually, in one or two cases we do). So we must use slices, which are variable length.

We start the same way as for individual objects: we have an index into the slice of binary data, and we transmute that into a pointer to the Rust type, in this case &'a [T] rather than &'a T. However, a reference to a slice is represented as a fat pointer containing the actual pointer to the data and the length of the slice. The result we have so far will have the length of the slice we transmuted, measured in bytes, which is not the length we want (the number of objects). So we have to set that length field. We do that via another transmute to the std::raw::Slice type, which is a struct that matches the compiler's representation of the slice pointer.

Code: parsing.rs:25:34
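A sketch of the idea, written with slice::from_raw_parts (which, as the update below notes, wraps the same fat-pointer construction); the bounds and alignment checks here are mine, not necessarily the library's:

use std::{mem, slice};

// Reinterpret `count` consecutive T values starting at byte `offset` of the
// input. T should be #[repr(C)] and valid for any bit pattern. The returned
// slice borrows from `input`, so it carries the same lifetime 'a.
pub fn read_array<'a, T>(input: &'a [u8], offset: usize, count: usize) -> &'a [T] {
    let byte_len = count * mem::size_of::<T>();
    assert!(offset + byte_len <= input.len());
    let ptr = input[offset..].as_ptr() as *const T;
    assert!(ptr as usize % mem::align_of::<T>() == 0);
    // from_raw_parts builds the fat pointer for us: the data pointer plus the
    // length, where the length counts objects, not bytes.
    unsafe { slice::from_raw_parts(ptr, count) }
}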

Strings

Reading a string is similar to reading a slice, except that we don't know the length in advance; we have to walk the string until we find a zero byte (since these are null-terminated strings). The ELF strings are ASCII strings, but since an ASCII string is bitwise-compatible with a UTF-8 (Rust) string, that aspect doesn't matter here.

Then we create the fat pointer to the string. A string is just a slice, so we can use std::raw::Slice again. This time, we make it from scratch using the pointer to the start of the string and the length we found earlier. Then we transmute the Slice to a &'a str (note that lifetime again).

Code: parsing.rs:38:49
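And a sketch of the string case (again illustrative; it assumes, as the spec says, that the bytes really are ASCII and that a terminating zero exists):

use std::{slice, str};

// Read a null-terminated ASCII string starting at byte `offset`. The returned
// &str borrows from `input` and shares its lifetime.
pub fn read_str<'a>(input: &'a [u8], offset: usize) -> &'a str {
    // Walk the string to find its length: stop at the terminating zero byte.
    let mut len = 0;
    while input[offset + len] != 0 {
        len += 1;
    }
    // Build the fat pointer from the start pointer and the length we found,
    // then view it as a str (ASCII is a subset of UTF-8, so this is valid).
    unsafe {
        let start = input.as_ptr().add(offset);
        str::from_utf8_unchecked(slice::from_raw_parts(start, len))
    }
}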

32 and 64 bit versions

ELF comes in both 32 and 64 bit flavours. Most of the time, this just means that the size of some fields changes; sometimes the ordering changes too. In the latter case we just need 32 and 64 bit versions of the data structure, and I use macros to reduce some code duplication. Where just the size changes, I use generic data structures whose actual type parameters are either P32 or P64, which just alias u32 and u64. I then need an enum for the two possibilities. For example, for SectionHeader:

pub enum SectionHeader<'a> {
    Sh32(&'a SectionHeader_<P32>),
    Sh64(&'a SectionHeader_<P64>),
}

pub struct SectionHeader_<P> {
    name: u32,
    type_: ShType_,
    flags: P,
    address: P,
    ...
}

The impl for SectionHeader has getter methods for each field. There are a lot of macros for generating these getters, and for other code de-duplication.
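Hand-written for a single field, a generated getter looks roughly like this (building on the declarations above):

impl<'a> SectionHeader<'a> {
    // Dispatch on the 32/64 bit variant and widen to u64 so callers see one type.
    pub fn flags(&self) -> u64 {
        match *self {
            SectionHeader::Sh32(h) => h.flags as u64,
            SectionHeader::Sh64(h) => h.flags,
        }
    }
}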

I'm not particularly happy with this solution. There is no type safety - we must be in either 32 or 64 bit mode (they can't be mixed), but this is not enforced by the types. And the code feels a bit crufty - it's neither elegant nor efficient.

One alternative is using a trait instead of an enum. This would skip some of the matching, but still needs a lot of boilerplate, and it doesn't address the safety question.
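For comparison, the trait-based version might look something like this (hypothetical, continuing the declarations above); it drops the per-field match, but nothing in the types stops 32 and 64 bit headers being mixed, which is the safety gap mentioned above:

// Callers would be generic over this trait instead of matching on the enum.
pub trait SectionHeaderView {
    fn flags(&self) -> u64;
    fn address(&self) -> u64;
}

impl SectionHeaderView for SectionHeader_<P32> {
    fn flags(&self) -> u64 { self.flags as u64 }
    fn address(&self) -> u64 { self.address as u64 }
}

impl SectionHeaderView for SectionHeader_<P64> {
    fn flags(&self) -> u64 { self.flags }
    fn address(&self) -> u64 { self.address }
}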

Going forward

This was just a toy mini-project, so I probably won't put much more effort into it. However, if anyone would find it useful, please let me know and I'll make some improvements.

Here's an incomplete list of things I'd like to work on:

  • Extract the parsing module into a crate - seems like it might be generally useful (in fact I'm using a copy and pasted version in another project already).
  • Flesh out some of the enums - I didn't add all possibilities, only the most common ones.
  • Documentation and testing - both sorely lacking.
  • Processor and OS extensions - ELF has lots of them, and implementing them is necessary for the library to be really useful. I've only implemented the core spec so far.
  • Consistent naming - one of the hard bits of library design is naming. At the moment naming is a bit inconsistent and should be tidied up.
  • More newtypes - I use integer types in a lot of places where I should use newtype-wrapped integers.
  • What should be unsafe? - The functions in the parsing module are horribly unsafe. At the moment they are safe functions with unsafe bodies. However, they don't check their preconditions (and sometimes can't), and are vulnerable to causing memory safety errors if passed bad data. So perhaps they should be unsafe functions. That would be much less ergonomic, though. I'm not sure how to handle this question. (There is a relevant recent blog post on this.)
  • Error handling - the library is a bit panic-happy at the moment; I would prefer to return Result in more places.
  • See also the issue list.

Update

There were some interesting comments about this post on r/rust. Some of the improvements suggested:

  • Use iterators instead of allocating a Vec - I definitely want to do this!
  • Use slice::from_raw_parts rather than raw::Slice - I should do this too; it boils down to the same thing, but it's better to abstract what you can.
  • Use CStr rather than handling strings manually - currently this would walk the string twice vs just once in the manual approach. If that gets fixed, then using CStr would be better (it abstracts away fiddly code and checks for UTF-8 compatibility).

Footnotes

1. Fundamentally, in normal use, we don't allocate anywhere. The only time we need to allocate is where we have a collection of pointers to data. In this case the data itself is not copied; we keep pointers into the underlying memory. However, the collection of pointers itself must be variable length (which prevents using the stack) and has no representation in the underlying data. Thus we use a Vec, which allocates memory.

An example of where this happens: getting a list of all strings in the string table. The strings themselves are just a view on the underlying memory, but the list of string pointers is a Vec. Note that looking up a specific string from the table (the usual case) does not need to allocate. Getting the whole table should be unusual.

