These are the slides from my BSDCan talk on a system-level public-key trust system for FreeBSD.
This post is more of a release announcement for a whitepaper I’ve been working on for some time now. After a number of edits, I am prepared to release the full paper for public reading. I will be seeking to present this in the form of a talk at open-source, particularly socially-focused conferences.
The central thesis centers on an alternative infrastructure for (among other things) social applications that is fundamentally decentralized, democratic, and user-respecting. No, this is not a blockchain proposal! While I claim no credit for it, the timing really couldn’t have been better, with the news about Facebook, Cambridge Analytica, and others breaking daily.
In the coming year, I will be seeking to implement the architecture I describe herein. I’ll be publicizing the paper, seeking non-profit funding and industry partners, and developing an open-source reference implementation.
But rather than keep talking, I’ll leave you to read the paper. Comments and other points of interest are welcome, and longer-form comments should be directed to me via email.
I posted a proposed countermeasure for the meltdown and spectre attacks to the freebsd-security mailing list last night. Having slept on it, I believe the reasoning is sound, but I still want to get input on it.
MAJOR DISCLAIMER: This is an idea I only just came up with last night, and it still needs to be analyzed. There may very well be flaws in my reasoning here.
Countermeasure: Non-Cacheable Sensitive Assets
The crux of the countermeasure is to move sensitive assets (e.g., keys, passwords, crypto state) into a separate memory region, and mark this region non-cacheable using MTRRs or equivalent functionality on a different architecture. I’ll assume for now that the rationale for why this should work will hold.
This approach has two significant downsides:
- It requires modification of applications, and it’s susceptible to information leaks from careless programming, missing sensitive assets, old code, and other such problems.
- It drastically increases the cost to access sensitive assets (a main-memory access), which is especially punitive if you end up using sensitive asset storage as a scratchpad space.
The upside of the approach is that it’s compatible with a move toward storing sensitive assets in secure memory or in special devices, such as a TPM, or the flash device I suggested in a previous post.
Programmatically, I could see this looking like a kind of “malloc with attributes” interface, one of the attributes being something like “sensitive”. I’ll save the API design for later, though.
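To make the “malloc with attributes” idea concrete, here is a purely illustrative sketch. It is written in Python rather than C for brevity, and it is not a proposed API: the names `MemAttr`, `Allocation`, and `malloc_attr` are hypothetical, and the sketch only records the requested attributes instead of actually placing the buffer in non-cacheable memory.

```python
import enum


class MemAttr(enum.Flag):
    """Hypothetical allocation attributes; the names are illustrative only."""
    NONE = 0
    SENSITIVE = enum.auto()     # would request non-cacheable (or otherwise secure) backing
    ZERO_ON_FREE = enum.auto()  # wipe the contents when the buffer is released


class Allocation:
    """A buffer tagged with the attributes it was requested with."""

    def __init__(self, size: int, attrs: MemAttr) -> None:
        self.buf = bytearray(size)
        self.attrs = attrs

    def free(self) -> None:
        # Sensitive data is always wiped, whether or not the caller asked.
        if self.attrs & (MemAttr.SENSITIVE | MemAttr.ZERO_ON_FREE):
            for i in range(len(self.buf)):
                self.buf[i] = 0


def malloc_attr(size: int, attrs: MemAttr = MemAttr.NONE) -> Allocation:
    """Model of a 'malloc with attributes' call.

    A real implementation would choose the backing store here (for example
    a region marked non-cacheable); this sketch only records the request.
    """
    return Allocation(size, attrs)


key = malloc_attr(32, MemAttr.SENSITIVE)
key.buf[:] = b"\x42" * 32   # pretend this is key material
key.free()                  # contents are zeroed on release
```

The point of routing everything through one attribute-taking call is that the backing store can change later (non-cacheable memory today, a separate device tomorrow) without touching callers.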
In this rationale, I’ll borrow the terminology “transient operations” to refer to any instruction or portion thereof which is being executed, but whose effects will eventually be cancelled due to a fault. In architecture terminology, this is called “squashing” the operation. The rationale for why this will work hinges on three assumptions about how any processor pipeline and potential side-channel necessarily must work:
- Execution must obey dependency relationships (I can’t magically acquire data for a dependent computation, unless it’s cached somewhere)
- Data which never reaches the CPU core as input to a transient operation cannot make it into any side-channel
- CPU architects will squash all transient operations when a fault or mispredicted branch is discovered as quickly as possible, so as to recover execution units for use on other speculative branches
The meltdown attack depends on being able to execute transient operations that depend on data loaded from a protected address in order to inject information into a side-channel before a fault is detected. The cache and TLB states are critical to this process. For this analysis, assume the cache is virtually-indexed (see below for the physically-indexed cache case). Break down the outcomes based on whether the given location is in cache and TLB:
- Cache Hit, TLB Hit: You have a race between the TLB and cache coming back. TLBs are typically smaller, so the TLB is unlikely to come back after the cache access. The fault is therefore detected almost immediately.
- Cache Hit, TLB Miss: You have a race between a page-table walk (potentially thousands of cycles) and a cache hit. This means you get the data back, and have a long time to execute transient operations. This is the main case for meltdown.
- Cache Miss, TLB Hit: The cache fill operation strongly depends on address translation, which signals a fault almost immediately.
- Cache Miss, TLB Miss: The cache fill operation strongly depends on address translation, which signals a fault after a page-table walk. You’re stalled for potentially thousands of cycles, but you cannot fetch the data from memory until the address translation completes.
Note that both the cache-miss cases defeat the attack. Thus, storing sensitive assets in non-cacheable memory should prevent the attack. Now, if your cache is physically-indexed, then every lookup depends on an address translation, and therefore, fault detection, so you’re still safe.
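The four cases can be written down as a small truth table. The sketch below is only a restatement of the reasoning above for the virtually-indexed case, not a model of any real pipeline; the function name and the labels "none", "negligible", and "long" are my own shorthand.

```python
from itertools import product


def transient_window(cache_hit: bool, tlb_hit: bool) -> str:
    """Classify one lookup in the virtually-indexed model described above.

    Returns a label for how long transient operations can run on the
    loaded data before the fault is detected.
    """
    if not cache_hit:
        # The cache fill depends on address translation, so the fault is
        # raised before any data comes back from memory.
        return "none"
    if tlb_hit:
        # Cache and TLB race; the TLB usually answers first.
        return "negligible"
    # Data is back while the page-table walk is still running.
    return "long"


for cache_hit, tlb_hit in product([True, False], repeat=2):
    print(f"cache_hit={cache_hit!s:5} tlb_hit={tlb_hit!s:5} -> "
          f"{transient_window(cache_hit, tlb_hit)}")
```

Only the cache-hit, TLB-miss row yields a long window, which is why forcing every access to a sensitive asset to behave like a cache miss removes the case the attack relies on.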
A New Attack?
In my original posting to the FreeBSD lists, I evidently had misunderstood the Spectre attack, and ended up making up a whole new attack (!!). This attack is still defeated by the non-cacheable memory trick. The attack works as follows:
- Locate code in another address space, which can potentially be used to gather information about sensitive information
- Pre-load registers to force the code to access sensitive information
- Jump to the code, causing your branch predictors to soak up data about the sensitive information
- When the fault kicks you back out, harvest information from branch predictors
This is defeated by the non-cacheable store as well, by the same reasoning.
Aside: this really is a whole new class of attack.
High-Probability Defense Against the Spectre Attack
The actual spectre attack relies on causing speculative execution of other processes to cause cache effects which are detectable within our process. The non-cacheable store is not an absolute defense against this; however, it does defeat the attack with very high probability.
The reasoning here is that any branch of speculative execution is highly unlikely to last longer than a full main memory access (which is possibly thousands of cycles). Branch mispredictions in particular will likely last a few dozen cycles at most, and execution of the mispredicted branch will almost certainly be squashed before the data actually arrives. Thus, it can’t make it into any side-channel.
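The argument reduces to comparing two durations. Here is a minimal sketch of that comparison; the cycle counts are assumed round numbers chosen only to match the orders of magnitude mentioned above (a few dozen cycles of speculation, roughly a thousand for an uncached load, and a handful for a cache hit), not measurements of any real processor.

```python
def data_reaches_side_channel(speculation_cycles: int,
                              load_latency_cycles: int) -> bool:
    """True if the loaded value arrives before the branch is squashed.

    Only data that actually arrives during the speculative window can be
    used by transient operations, and therefore leak.
    """
    return load_latency_cycles < speculation_cycles


# Assumed, illustrative cycle counts (not measured values).
MISPREDICT_WINDOW = 40   # "a few dozen cycles"
CACHED_LOAD = 4          # load that hits in cache
UNCACHED_LOAD = 1000     # load that must go to main memory

print(data_reaches_side_channel(MISPREDICT_WINDOW, CACHED_LOAD))    # cached asset
print(data_reaches_side_channel(MISPREDICT_WINDOW, UNCACHED_LOAD))  # non-cacheable asset
```

Under these assumptions the cached load arrives in time and the uncached one does not. The defense is probabilistic precisely because a sufficiently long speculative window would flip the second result.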
This is a potential defense against speculative execution-based side-channel attacks, which is based on restoring the dependency between fault detection and memory access to sensitive assets, and incurring a general access delay to sensitive assets.
This blocks any speculative branch which will eventually cause a fault from accessing sensitive information, since doing so necessarily depends on fault detection. This has the effect of defeating attacks that rely on speculative execution of transient operations on this data before the fault is detected.
This also defeats attacks which observe side-channels manipulated by speculative branches in other processes that can be made to access sensitive data, as the delay makes it extremely unlikely that data will arrive before the branch is squashed.
Assuming the reasoning here is sound, I plan to start working on implementing this for FreeBSD immediately. Additionally, this defense is a rather coarse mechanism which repurposes MTRRs to effectively mark data as “not to be speculatively executed”. A new architecture, such as RISC-V, can design more refined mechanisms. Finally, the API for this defense ought to be designed so as to provide a general mechanism for storage of sensitive assets in “a secure location”, which can be non-cacheable memory, a TPM, a programmed flash device, or something else.
This week has seen the disclosure of a devastating vulnerability: meltdown and its sister vulnerability, spectre. Both are a symptom of a larger problem that has evidently gone unnoticed for the better part of 50 years of computer architecture: the potential for side-channels in processor pipeline designs.
Aside: I will confess that I am rather ashamed that I didn’t notice this, despite having a background in all the knowledge areas that should have revealed it. It took me all of three sentences of the paper before I realized what was going on. Then again, this somehow got by every other computer scientist for almost half a century. The conclusion seems to be that we are, all of us, terrible at our jobs…
A New Class of Attack, and Implications
I won’t go into the details of the attacks themselves; there are already plenty of good descriptions of that, not the least of which is the paper. My purpose here is to analyze the broader implications with particular focus on the operating systems security perspective.
To be absolutely clear, this is a very serious class of vulnerabilities. It demands immediate and serious attention from both hardware architects and OS developers, not to mention people further up the stack.
The most haunting thing about this disclosure is that it suggests the existence of an entire new class of attack, of which the meltdown/spectre attacks are merely the most obvious. The problem, which has evidently been lurking in processor architecture design for half a century, has to do with two central techniques in processor design:
- The fact that processors tend to enhance performance by making commonly-executed things faster. This is done by a whole host of features: memory caches, trace caches, and branch predictors, to name the more common techniques.
- The fact that processor design has followed a general paradigm of altering the order in which operations are executed, whether by overlapping multiple instructions (pipelining), executing multiple instructions in parallel (superscalar), executing them out of order and then reconciling the results (out-of-order), or executing multiple possible outcomes and keeping only one result (speculative).
The security community is quite familiar with the impacts of (1): it is the basis for a host of side-channel vulnerabilities. Much of the work in computer architecture deals with implementing the techniques in (2) while maintaining the illusion that everything is executed one instruction at a time. CPUs are carefully designed for this with regard to processor state according to the ISA, carefully tracking which operations depend on which and keeping track of the multiple possible worlds that arise from speculation. What evidently got by us is that the side-channels provided by (1) provide an attacker with the ability to violate the illusion of executing one instruction at a time.
What this almost certainly means is that this is not the last attack of this kind we will see. Out-of-order speculative execution happens to be the most straightforward to attack. However, I can sketch out ways in which even in-order pipelines could potentially be vulnerable. When we consider more obscure architectural features such as trace-caches, branch predictors, instruction fusion/fission, and translation units, it becomes almost a certainty that we will see new attacks in this family published well into the future.
A complicating factor in this is that CPU pipelines are extraordinarily complicated, particularly these days. (I recall a tech talk I saw about the SPARC processor, and its scope was easily as large as the FreeBSD core kernel, or the HotSpot virtual machine.) Moreover, these designs are typically closed. There are a lot of different ways a core can be designed, even for a simple ISA, and we can expect entire classes of these design decisions to be vulnerable.
The x86 architecture tends to have translation units that are significantly more complex than the cores themselves, and are likely to be a nest of side-channels.
The implications of this for OS (and even application) security are rather dismal. The one saving virtue in this is that these attacks are read-only: they can be used to exfiltrate data from any memory immediately accessible to the pipeline, but it’s a pretty safe assumption that they cannot be used to write to memory.
That boon aside, this is devastating to OS security. As a kernel developer, there aren’t any absolute defenses that we can implement without a complete overhaul of the entire paradigm of modern operating systems and major revisions to the POSIX specification. (Even then, it’s not certain that this would work, as we can’t know how the processors implement things under the hood.)
Therefore, OS developers are left with a choice among partial solutions. We can make the attacks harder, but we cannot stop them in all cases. The only perfect defense is to replace the hardware, and hope nobody discovers a new side-channel attack.
Attack Surface and Vulnerable Assets
The primary function of these kinds of attacks is to exfiltrate data across various isolation boundaries. The following is the most general description of the capabilities of the attack, as far as I can tell:
- Any data that can be loaded into a CPU register can potentially be converted into the execution of some number of events before a fault is detected
- These execution patterns can affect the performance of future operations, giving rise to a side-channel
- These side-channels can be read through various means (note that this need not be done by the same process)
This gives rise to the ability to violate various isolation boundaries (though only for the purposes of observing state):
- Reads of kernel/hypervisor memory from a guest (aka “meltdown”)
- Reads of another process’ address space (aka “spectre”)
- Reads across intra-process isolation mechanisms, such as different VM instances (this attack is not named, but constitutes, among other things, a perfect XSS attack)
A salient feature of these attacks is that they are somewhat slow: as currently described, the attack will incur 2n + 1 cache faults to read out n bits (not counting setup costs). I do not expect this to last, however.
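To put the 2n + 1 figure in perspective, here is a short worked example. The 256-bit key size is just an illustrative choice, and the function name is mine.

```python
def cache_faults_needed(n_bits: int) -> int:
    """Cache faults incurred to read out n bits, per the 2n + 1 figure above
    (setup costs not counted)."""
    return 2 * n_bits + 1


# Example: a 256-bit secret such as a symmetric key.
print(cache_faults_needed(256))  # 513
```

A few hundred cache faults is a small price for a long-lived key, which is why slowness offers little comfort for the assets discussed next.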
The most significant danger of this attack is the exfiltration of secret data, ideally small and long-lived data objects. Examples of key assets include:
- Master keys for disk encryption (example: the FreeBSD GELI disk encryption scheme)
- TLS session keys
- Kerberos tickets
- Cached passwords
- Keyboard input buffers, or textual input buffers
More transient information is also vulnerable, though there is an attack window. The most vulnerable assets are those keys which necessarily must remain in memory and unencrypted for long periods of time. The best example of this I can think of is a disk encryption key.
Imperfect OS Defense Mechanisms
Unfortunately, operating systems are limited in their ability to adequately defend against these attacks, and moreover, many of these mechanisms incur a performance penalty.
Separate Kernel and User Page Tables
The solution adopted by the Linux KAISER patches is to unmap most of the kernel from user space, keeping only essential CPU metadata and the code to switch page tables mapped. Unfortunately, this requires a TLB flush for every system call, incurring a rather severe performance penalty. Additionally, this only protects from access to a kernel address; it cannot stop accesses to other process address spaces or crossing isolation boundaries within a process.
Make Access Attempts to Kernel Memory Non-Recoverable
An idea I proposed on the FreeBSD mailing lists is to have any attempted memory access to a kernel address range result in an immediate cache flush followed by non-recoverable process termination. This avoids the cost of separate kernel and user page tables, but is not a perfect defense in that there is a small window of time in which the attack can be carried out.
Special Handling of Sensitive Assets
Another potential middle-ground is to handle sensitive kernel assets specially, storing them in a designated location in memory (or better yet, outside of main memory). If a main-memory store is used, this range alone can be mapped and unmapped when crossing into kernel space, thus avoiding most of the overhead of a TLB flush. This would only protect assets that are stored in this manner, however; anything else (most notably the stacks for kernel threads) would remain vulnerable.
Userland programs are even less able to defend against such attacks than the kernel; however, there are architectural considerations that can be made.
Avoid Holding Sensitive Information for Long Periods
As with kernel-space sensitive assets, one mitigation mechanism is to avoid retaining sensitive assets for long periods of time. An example of this is GPG, which keeps a user’s keys decrypted only for a short period of time (usually 5 minutes) after they request to use a key. While this is not perfect, it limits attacks to a brief window, and gives users the ability to create practices which ensure that they shut off other means of attack during this window.
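The pattern of holding a secret only for a bounded window can be sketched as follows. This is a hypothetical illustration, not how GPG is implemented: the `ExpiringSecret` class and its injectable `clock` parameter are my own names, and Python cannot guarantee that no stray copies of the secret linger in memory, so a real implementation would need to live in a lower-level language.

```python
import time
from typing import Callable, Optional


class ExpiringSecret:
    """Holds a secret in memory for a bounded time, then wipes it."""

    def __init__(self, secret: bytes, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._buf = bytearray(secret)
        self._clock = clock
        self._expires_at = clock() + ttl_seconds

    def get(self) -> Optional[bytes]:
        """Return the secret, or None once the window has closed."""
        if self._buf and self._clock() >= self._expires_at:
            self.wipe()
        return bytes(self._buf) if self._buf else None

    def wipe(self) -> None:
        # Overwrite in place before dropping the contents.
        for i in range(len(self._buf)):
            self._buf[i] = 0
        self._buf.clear()


# Demonstration with a fake clock so the example does not have to sleep.
fake_now = [0.0]
cached = ExpiringSecret(b"passphrase-derived key", ttl_seconds=300,
                        clock=lambda: fake_now[0])
print(cached.get() is not None)  # inside the 5-minute window
fake_now[0] = 301.0
print(cached.get() is None)      # window closed, secret wiped
```

The design choice worth noting is that expiry is checked on access and the buffer is overwritten rather than merely dereferenced, so the secret does not sit in memory waiting for garbage collection.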
Minimize Potential for Remote Execution
As JIT compilers and interpreters are a common mechanism for limited execution of remote code, they are in a position to attempt to mitigate many of these attacks (there is an LLVM patch out to do this). This is once again an imperfect defense, as they are on the wrong side of Rice’s theorem. There is no general mechanism for detecting these kinds of attacks, particularly where multiple threads of execution are involved.
Interim Hardware Mitigations
The only real solution is to wait for hardware updates which fix the vulnerabilities. This takes time, however, and we would like some interim solution.
One possible mitigation strategy is to store and process sensitive assets outside of the main memory. HSMs and TPMs are an example of this (though not one I’m apt to trust), and the growth of the open hardware movement offers additional options. One in particular, which I may put into effect myself, uses a small FPGA device (such as the PicoEVB) programmed with an open design as such a device.
More generally, I believe this incident highlights the value of these sorts of hardware solutions, and I hope to see increased interest in developing open implementations in the future.
The meltdown and spectre attacks, severe as they are by themselves, represent the beginning of something much larger. A new class of attack has come to the forefront, which enables data exfiltration in spite of hardware security mechanisms, and we can and should expect to see more vulnerabilities of this kind in the future. What makes these vulnerabilities so dangerous is that they directly compromise hardware protection mechanisms on which OS security depends, and thus there is no perfect defense at the OS level against them.
What can and should be done is to adapt and rethink security approaches at all levels. At the hardware level, a paradigm shift is necessary to avoid vulnerabilities of this kind, as is the consideration that hardware security mechanisms are not as absolute as has been assumed. At the OS level, architectural changes and new approaches can harden an OS against these vulnerabilities, but they generally cannot repel all attacks. At the user level, similar architectural changes as well as minimization of attack surfaces can likewise reduce the likelihood and ease of attack, but cannot repel all such attacks.
As a final note, the trend in information security has been increasing focus on application, and particularly web application, security. This event, as well as the fact that a relatively simple vulnerability managed to evade detection by the entire field of computer science for half a century, strongly suggests that systems security is not at all the solved problem it is commonly assumed to be, and that new thinking and new approaches are needed in this area.
I’ll open with a confession: the idea for this recipe evidently originates from Hannibal Lecter’s recipe collection! A chance video clip I happened across showed a recipe card, and I found the contents somewhat intriguing. In particular, it hadn’t occurred to me to use heavy cream in a braise. I ended up crafting this recipe as a result.
Its origins aside, I made this on Christmas day 2017, and it came out marvelously, aside from the usual errors I make for braising: too much liquid and typically about twice as much of the sauteed components as I need. The ingredient list here is normalized to 1 lb of meat and attempts to adjust for these errors.
I use skirt-steak here, as its loose texture is ideal for soaking up the liquid in a braise. Substitute other cuts as desired, but skirt steak did quite well.
- 1 lb beef skirt steak
- 1/2 cup diced porchetta (or pork belly)
- 1/2 cup heavy cream
- 1/2 lemon
- 1 cup full-bodied red wine (Cabernet, Burgundy, or Bordeaux)
- 1/2 cup brandy
- 4-6 sprigs fresh thyme
- 2-3 sprigs fresh leafy rosemary
- 1 bay leaf
- 1/2 shallot, diced
- 3-4 cloves garlic, smashed
- 1/2 yellow onion, chopped
- mushrooms: chanterelle, porcini, portabella, chopped
- ground kosher salt and mixed peppercorns
Oven temperature: 275 F
I recommend some manner of fresh pasta and fresh-baked bread as a side.
The procedure has two components: preparation about 12-24 hours in advance, and actual cooking.
About a day in advance, prepare the meat and mushrooms to be marinated.
The ground salt and pepper aren’t shown. For this, I use my trusty mortar and pestle (one of the best kitchen tools I ever bought, really). First, cut the skirt steak up and rub it down with the salt and pepper mix.
Next, smash the garlic, chop the aromatic mushrooms (chanterelle and porcini, or any others you use), and layer the meat, garlic and mushrooms in a container.
Next, pour in the wine. I usually include a little olive oil to dissolve the non-polar components (most notably garlic essence) and mix them in. You can add a bit of brandy as well. (One thing, though: never use acidic marinades, which I discovered by accident once)
Put this in the refrigerator and let it sit overnight. After a few hours, some amount of the wine will be absorbed into the skirt steak, and some of the meat juices will mix with the wine. I waited a few hours, then pressed the mixture down below the level of the liquid (not shown).
Sautee and Braising
The main cooking step is sauteeing and braising. Begin by removing the meat and allowing it to drain. Also, strain the mushrooms out of the marinating liquid, chop the onion, shallots, and the rest of the mushrooms, and set them aside. Be sure to keep all of the marinating liquid and everything that drains from the meat.
Sear the meat with a bit of oil. Just give it a nice sear; don’t cook it all the way through.
Now add the diced porchetta and brown it (I forgot to do this, and added it later). Once this is done, remove most of the fat, then splash in the brandy and slosh it around to dissolve everything that’s stuck to the pan. (Skirt steak is very lean, so it may not be necessary to remove any fat.) Then, add in the onion, shallots, mushrooms, and garlic and sautee them until nice and brown.
After sauteeing, add in the marinating liquid, then squeeze in the lemon.
(Note that I added the diced porchetta here, instead of earlier)
Now add in the bay leaf, thyme and rosemary sprigs (I usually tear the leaves off the rosemary)
Simmer this until the liquid boils down to about 2/3-1/2 original volume (remember that I ended up with about twice as much sauteed mushrooms and onions as I wanted here). Once this is done, turn off the heat and add the heavy cream.
A word on heavy cream: it’s magic. It’s one of the very few things that can blend polar (water-soluble) and non-polar (oil-soluble) things together in happy harmony, and keep it that way. As far as braising goes, it’s a bit daunting to cook milk, but remember, heavy cream is basically another form of butter.
Place the meat in amongst the rest, spaced out evenly.
Cover the braising container and put it in the oven for about 20 minutes. Remove, turn the meat over, then put back in for about 10 minutes. In this case, I cut into the meat to determine if it was done.
At the end, there will be some separation of the oil from the rest. However, you can mix this back together and it will stay emulsified. After removing from the oven, place the cover on the braising container with a small gap to allow steam to escape, and let it sit for about 5-10 minutes before serving.
This recipe refrigerates and reheats quite well, and will only slightly separate even when refrigerated. Reheating will re-cook the meat some, so I suggest pulling it apart with forks prior to cold storage.
These are the slides from my vBSDcon talk on GELI work.