I have just completed (for some value of “complete”) a project to refactor the FreeBSD EFI boot and loader code. This originally started out as an investigation of a possible avenue in my work on GELI full-disk encryption support for the EFI boot and loader, and grew into a project in its own right.
More generally, this fits into a bunch of work I’m pursuing or planning to pursue in order to increase the overall tamper-resistance of FreeBSD, but that’s another article.
To properly explain all this, I need to briefly introduce both the FreeBSD boot and loader architecture and EFI.
FreeBSD Boot Architecture
When an operating system starts, something has to do the work of getting the kernel (and modules, and often other stuff) off the disk and into memory, setting everything up, and then actually starting it. This is the boot loader. Boot loaders are often in a somewhat awkward position: they need to do things like read filesystems, detect some devices, load configurations, and do setup, but they don’t have the usual support of the operating system to get it done. Most notably, they are difficult to work with because if something goes wrong, there is very little in the way of recovery, debugging, or even logging.
Moreover, back in the old days of x86 BIOS, space was a major concern: the BIOS pulled in the first disk sector, meaning the program had to fit into less than 512 bytes. Even once a larger program was loaded, you were still in 16-bit execution mode.
To deal with this, FreeBSD adopted a multi-stage approach. The initial boot loader, called “boot”, had the sole purpose of pulling in a more featureful loader program, called “loader”. In truth, boot consisted of two stages itself: the tiny boot block, and then a slightly more powerful program loaded from a designated part of the BSD disklabel.
The loader program is much more powerful, having a full suite of filesystem drivers, a shell, facilities for loading and unloading the kernel, and other things. This two-phase architecture overcame the severe limitations of the x86 BIOS environment. It also allowed the platform-specific boot details to be separated from both the loader program and the kernel. This sort of niceness is the hallmark of a sound architectural choice.
Inside the loader program, the code uses a set of abstracted interfaces to talk about devices. Devices are detected, bound to a device switch structure, and then filesystem modules provide a way to access the filesystems those devices contain. Devices themselves are referred to by strings that identify the device switch managing them. This abstraction allows loader to support a huge variety of configurations and platforms in a uniform way.
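To make the device switch idea concrete, here is a minimal sketch in C. It is not the loader's actual code: the struct layout, the names `devsw_sketch`, `find_devsw`, and `open_by_name`, and the toy `disk` and `net` entries are all assumptions made for illustration. It shows roughly how a device name such as `disk1` could be resolved to the switch that manages it, plus a unit number.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified device switch: one entry per device class. */
struct devsw_sketch {
    const char *dv_name;          /* prefix used in device names */
    int (*dv_open)(int unit);     /* returns 0 on success, -1 on failure */
};

/* Toy backends: pretend there are four disks and one network device. */
static int disk_open(int unit) { return (unit >= 0 && unit < 4) ? 0 : -1; }
static int net_open(int unit)  { return unit == 0 ? 0 : -1; }

static const struct devsw_sketch devsw_table[] = {
    { "disk", disk_open },
    { "net",  net_open  },
};

/*
 * Resolve a name such as "disk1" to its switch entry and unit number.
 * Returns NULL if no switch claims the name or the unit is malformed.
 */
const struct devsw_sketch *find_devsw(const char *name, int *unit)
{
    size_t i;

    assert(name != NULL && unit != NULL);
    for (i = 0; i < sizeof(devsw_table) / sizeof(devsw_table[0]); i++) {
        size_t len = strlen(devsw_table[i].dv_name);
        const char *rest;
        char *end;
        long n;

        if (strncmp(name, devsw_table[i].dv_name, len) != 0)
            continue;
        rest = name + len;
        if (!isdigit((unsigned char)*rest))
            continue;
        n = strtol(rest, &end, 10);
        if (*end != '\0')
            continue;
        *unit = (int)n;
        return &devsw_table[i];
    }
    return NULL;
}

/* Open a device by name through whichever switch manages it. */
int open_by_name(const char *name)
{
    int unit;
    const struct devsw_sketch *dv = find_devsw(name, &unit);

    return dv != NULL ? dv->dv_open(unit) : -1;
}
```

In this sketch, callers only ever see the string name; everything platform-specific hides behind the function pointers in the table.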
The Extensible Firmware Interface
In the mid-2000s, the Extensible Firmware Interface started to replace BIOS as the boot environment on x86 platforms. EFI is far more modern, featureful, abstracted, and easy to work with than the archaic, crufty, and often unstandardized or undocumented BIOS. I’ve written boot loaders for both; EFI is pretty straightforward, where BIOS is a tarpit of nightmares.
One thing EFI does is remove the draconian constraints on the initial boot loader. The firmware loads a specific file from a filesystem, rather than a single block from a disk. The EFI spec guarantees support for the FAT32 filesystem and the GUID Partition Table, and individual platforms are free to support others.
Another thing EFI does is provide abstracted interfaces for device IO, filesystems, and much else. Devices, both concrete hardware and derived devices such as disk partitions and network filesystems, are represented using “device handles”, which support various operational interfaces through “protocol interfaces”, and are named using “device paths”. Moreover, vendors and operating system authors alike are able to provide their own drivers through a driver binding interface, which can create new device handles or bind new protocol interfaces to existing ones.
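The handle-and-protocol model can be pictured as a small lookup structure. The following C sketch is a toy model, not the UEFI API: `guid_sketch`, `handle_sketch`, `install_protocol`, and `handle_protocol` are hypothetical names, loosely patterned on the firmware's InstallProtocolInterface and HandleProtocol boot services. A handle is just a bag of (GUID, interface) pairs, and code asks a handle whether it supports a given protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a 128-bit EFI GUID; protocols are identified by value. */
struct guid_sketch { unsigned char bytes[16]; };

#define MAX_PROTOCOLS 8

/* One (GUID, interface) pair attached to a handle. */
struct protocol_entry {
    struct guid_sketch guid;
    void *interface;
};

/* Hypothetical device handle: a collection of protocol interfaces. */
struct handle_sketch {
    struct protocol_entry protocols[MAX_PROTOCOLS];
    size_t count;
};

/* Look up an interface by GUID; NULL means the handle lacks that protocol. */
void *handle_protocol(const struct handle_sketch *h, const struct guid_sketch *g)
{
    size_t i;

    assert(h != NULL && g != NULL);
    for (i = 0; i < h->count; i++) {
        if (memcmp(&h->protocols[i].guid, g, sizeof(*g)) == 0)
            return h->protocols[i].interface;
    }
    return NULL;
}

/* Attach an interface; returns 0 on success, -1 if duplicate or full. */
int install_protocol(struct handle_sketch *h, const struct guid_sketch *g,
                     void *iface)
{
    assert(h != NULL && g != NULL && iface != NULL);
    if (handle_protocol(h, g) != NULL || h->count == MAX_PROTOCOLS)
        return -1;
    h->protocols[h->count].guid = *g;
    h->protocols[h->count].interface = iface;
    h->count++;
    return 0;
}
```

A driver that layers a filesystem on top of a block device would, in this model, call `install_protocol` on an existing handle to add a second interface alongside the block IO one.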
FreeBSD Loader and EFI Similarities
The FreeBSD loader and the EFI framework do many of the same things, and they do them in similar ways most of the time. Both have an abstracted representation of devices, interfaces for interacting with them, and a way of naming them. In many ways, the FreeBSD loader framework is prescient in that it did many of the things that EFI ended up doing.
The one shortcoming of the FreeBSD loader is its lack of support for dynamic device detection, also known as “hotplugging”. When FreeBSD’s boot architecture was created (circa 1994), hotplugging was extremely uncommon: most hardware was expected to be connected permanently and remain connected for the duration of operation. Hence, the architecture was designed around a model of one-time static detection of all devices, and the code evolved around that assumption. Hotplugging was added to the operating system itself, of course, but there was little need for it in the boot architecture. When EFI was born (mid-2000s), hot-pluggable devices were common, and so supporting them was an obvious design choice.
EFI does this through its driver binding interface, where drivers register a set of callbacks that check whether a device is supported and then attempt to attach to it. When a device is disconnected, another callback is invoked to detach the driver. FreeBSD’s loader, on the other hand, expects to detect all devices in a probing phase during its initialization. It then sets up additional structures (most notably, its new bcache framework) based on the list of detected devices. Some phases of detection may rely on earlier ones; for example, the ZFS driver may update some devices that were initially detected as block devices.
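The callback-driven model can be sketched in a few lines of C. This is a toy model under stated assumptions, not the firmware's interface: `driver_sketch`, `device_sketch`, `connect_device`, and `disconnect_device` are hypothetical names, and the three callbacks only loosely mirror the check, attach, and detach steps described above.

```c
#include <assert.h>
#include <stddef.h>

enum dev_kind { DEV_DISK, DEV_NIC };

struct device_sketch;

/* Hypothetical driver: three callbacks registered with the framework. */
struct driver_sketch {
    const char *name;
    int  (*supported)(const struct device_sketch *dev); /* nonzero if usable */
    int  (*start)(struct device_sketch *dev);           /* 0 on success */
    void (*stop)(struct device_sketch *dev);
};

struct device_sketch {
    enum dev_kind kind;
    const struct driver_sketch *driver;  /* NULL while unclaimed */
    int started;
};

/* Toy disk driver that only claims DEV_DISK devices. */
static int  disk_supported(const struct device_sketch *d) { return d->kind == DEV_DISK; }
static int  disk_start(struct device_sketch *d) { d->started = 1; return 0; }
static void disk_stop(struct device_sketch *d)  { d->started = 0; }

const struct driver_sketch disk_driver = {
    "disk", disk_supported, disk_start, disk_stop
};

/* Called whenever a device appears: offer it to each driver in turn. */
int connect_device(const struct driver_sketch *const *drivers, size_t n,
                   struct device_sketch *dev)
{
    size_t i;

    assert(drivers != NULL && dev != NULL);
    for (i = 0; i < n; i++) {
        if (drivers[i]->supported(dev) && drivers[i]->start(dev) == 0) {
            dev->driver = drivers[i];
            return 0;
        }
    }
    return -1;  /* no driver claimed the device */
}

/* Called when a device goes away: let its driver tear down state. */
void disconnect_device(struct device_sketch *dev)
{
    assert(dev != NULL);
    if (dev->driver != NULL) {
        dev->driver->stop(dev);
        dev->driver = NULL;
    }
}
```

The contrast with the loader's approach is that nothing here depends on a complete device list existing up front; `connect_device` can be called at any time, for any device.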
As I mentioned, my work on this began as a strategy for implementing GELI support. A problem with the two-phase boot process is that it’s difficult to pass information between the two phases, particularly in EFI, where all code is position-independent, no hard addresses are guaranteed, and components are expected to talk through abstract interfaces. (In other words, it rules out the sort of hacks that the non-EFI loader uses!) This is a problem for something like GELI, which has to ask for a password to unlock the filesystem, and we don’t want to ask for that password more than once. Also, much of what I was having to implement for GELI, such as abstract devices and a GPT partition driver, ended up mirroring things that already existed in the EFI framework.
I ended up refactoring the EFI boot and loader to make more use of the EFI framework, particularly its protocol interfaces. The following is a summary of the changes:
- The boot and loader programs now look for instances of the EFI_SIMPLE_FILE_SYSTEM_PROTOCOL, and use that interface to load files.
- The filesystem backend code from loader was moved into a driver which does the same initialization as before, then attaches EFI_SIMPLE_FILE_SYSTEM_PROTOCOL interfaces to all device handles that host supported filesystems.
- This is accomplished through a pair of wrapper interfaces that translate EFI_SIMPLE_FILE_SYSTEM_PROTOCOL and the FreeBSD loader framework’s filesystem interface back and forth.
- I originally wanted to move all device probing and filesystem detection into the EFI driver model, where probing and detection would be done in callbacks. However, this didn’t work, primarily because the bcache framework is strongly coupled to the static detection way of doing things.
- Interfaces and device handles installed in boot can be used by loader without problems. This provides a way to pass information between phases.
- The boot and loader programs can also make use of interfaces installed by other programs, such as GRUB, or custom interfaces provided by open-source firmware.
- The boot and loader programs now use the same filesystem backend code; the minimal versions used by boot have been discarded.
- Drivers for GELI, custom partition schemes, and the like can work by creating new device nodes and attaching device paths and protocol interfaces to them.
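The wrapper idea from the list above can be sketched as follows. This is only an illustration of one direction of the translation (exposing a loader-style backend through an EFI-style interface); the types `loader_fs_sketch`, `efi_file_sketch`, and `fs_wrapper`, the status codes, and the in-memory `mem_file` backend are all hypothetical and much simpler than the real interfaces in the branch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical loader-style filesystem interface: plain C read call. */
struct loader_fs_sketch {
    long (*fo_read)(void *priv, void *buf, size_t len); /* bytes read or -1 */
    void *priv;
};

/* Hypothetical EFI-style interface: methods take `self`, return a status. */
typedef int status_t;
#define ST_SUCCESS      0
#define ST_DEVICE_ERROR 1

struct efi_file_sketch {
    status_t (*read)(struct efi_file_sketch *self, size_t *len, void *buf);
};

/*
 * Wrapper: the EFI-style interface is the first member, so a pointer to
 * it can be converted back to a pointer to the enclosing wrapper.
 */
struct fs_wrapper {
    struct efi_file_sketch iface;
    struct loader_fs_sketch *backend;
};

/* Translate an EFI-style read into a loader-style read. */
static status_t wrapper_read(struct efi_file_sketch *self, size_t *len, void *buf)
{
    struct fs_wrapper *w = (struct fs_wrapper *)self;
    long n = w->backend->fo_read(w->backend->priv, buf, *len);

    if (n < 0)
        return ST_DEVICE_ERROR;
    *len = (size_t)n;   /* report how much was actually read */
    return ST_SUCCESS;
}

void fs_wrapper_init(struct fs_wrapper *w, struct loader_fs_sketch *backend)
{
    assert(w != NULL && backend != NULL);
    w->iface.read = wrapper_read;
    w->backend = backend;
}

/* Toy backend for demonstration: reads from an in-memory string. */
struct mem_file { const char *data; size_t pos; };

long mem_read(void *priv, void *buf, size_t len)
{
    struct mem_file *f = priv;
    size_t left = strlen(f->data) - f->pos;

    if (len > left)
        len = left;
    memcpy(buf, f->data + f->pos, len);
    f->pos += len;
    return (long)len;
}
```

A caller holding only `&w.iface` never needs to know that a loader filesystem module sits behind it, which is what lets the same backend code serve both boot and loader in this model.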
I sent an email out to -hackers announcing the patch this morning, and I hope to get GELI support up and going in the very near future (the code is all there; I just need to plug it into the EFI driver binding and get it building and running properly).
For anyone interested, the branch can be found here: https://github.com/emc2/freebsd/tree/efize