MDS Attacks: Microarchitectural Data Sampling

On Jan 27, 2020, we (VUSec) disclose two "new" RIDL/MDS variants at the end of another (third) embargo, as described in our second RIDL addendum. We do not think either of these variants is particularly novel or interesting and they are simply "more RIDL" (focusing on ways to get data into microarchitectural buffers RIDL can leak from).

L1D Eviction Sampling (L1DES).
The first issue, which Intel refers to as L1D Eviction Sampling (L1DES), is a RIDL variant that leaks from L1D evictions (assigned CVE-2020-0549). It may seem that sometimes history does repeat itself, because this is again something that we had already shown in our original RIDL paper, as shown in Figure 6. In the camera-ready version of the RIDL paper, we also explicitly mentioned that, at every context switch, "the entire L1 cache needs to be flushed first since the entries go through the LFBs" to properly mitigate RIDL. We removed this sentence in the original version of the RIDL paper released on May 14, 2019, since Intel had not yet mitigated the RIDL variant based on L1D evictions (which would eventually become L1DES). Since then, we spent months trying to convince Intel that leaks from L1D evictions were possible and needed to be addressed.
On Oct 25, 2019, we reported to Intel that this variant would bypass their latest VERW mitigation (and so did a PoC shared with Intel on May 10, 2019), resulting in Intel finally acknowledging the L1D eviction issue and requesting another (L1DES) embargo. We learned that Intel had not found this issue internally and that the only other independent finder was the Zombieload team, which reported a PoC to Intel in May, 2019. The CacheOut team did an independent security analysis of L1DES. Our RIDL-L1DES PoC is available on GitHub.
Vector Register Sampling (VRS).
The second issue, which Intel refers to as Vector Register Sampling (VRS), is a RIDL variant that leaks from vector registers (assigned CVE-2020-0548). This variant shows that RIDL can also leak values that are never even stored in memory. In reality, this is possible with a small, 1-line of code variation of our 'alignment write' PoC (originally leaking values stored in memory using alignment faults), which we shared with Intel on May 11, 2019. Since then, we spent months trying to convince Intel that our 'alignment write' PoC and its variations needed to be properly addressed.
On Oct 1, 2019, we reported to Intel that a 1-line modification of our 'alignment write' PoC can leak vector register values, resulting in Intel requesting a new (VRS) embargo. We are not aware of other independent finders to acknowledge for VRS. Our RIDL-VRS PoC is available on GitHub.

In other news: RIDL leaks the root passwd hash in a few seconds

Meanwhile, our students have been playing with RIDL last year. As shown below, one of our bright students, Finn de Ridder, built a new RIDL-TAA exploit that can leak the entire root hash password in only 4 seconds (!!!).

Our latest optimized RIDL-TAA exploit can leak the root password hash from the /etc/shadow file on Linux in only 4 seconds.

On Nov 12, 2019, we (VUSec) disclosed TSX Asynchronous Abort (TAA), a new speculation-based vulnerability in Intel CPUs as well as other MDS-related issues, as described in our first RIDL addendum. In reality, this is no new vulnerability. We disclosed TAA (and other issues) as part of our original RIDL submission to Intel in Sep 2018. Unfortunately, the Intel PSIRT team missed our submitted proof-of-concept exploits (PoCs), and as a result, the original MDS mitigations released in May 2019 only partially addressed RIDL. You can read the full story below.

In particular, at the request of Intel, we withheld the following details on the original RIDL/MDS disclosure date:

TSX Asynchronous Abort (TAA). Intel's TSX hardware feature can be used to efficiently mount a RIDL attack even on allegedly non-vulnerable CPUs (with hardware mitigations).
Alignment faults. These can be used to trigger an exception, giving an attacker yet another way of leaking data. This attack vector seems to be fixed in the latest generation of Intel CPUs.
Flawed MDS mitigation. The initial mitigations against MDS clear the buffers by writing stale, potentially sensitive, data into these buffers, allowing an attacker to leak information despite mitigations being enabled.
The RIDL test suite. We can now release the RIDL test suite at https://github.com/vusec/ridl.

Impact

TL;DR: an attacker can mount a RIDL attack despite the in-silicon mitigations/microcode patches published in May 2019 being in place.

In particular, TSX Asynchronous Abort (TAA) allowed us to mount a practical RIDL exploit (as already shown in /etc/shadow leak presented in the RIDL paper) even on systems with in-silicon/microcode MDS mitigations enabled. Moreover, one of our brilliant Master students (Jonas Theis) used TAA to optimize our initial password-disclosing exploit to run in just 30 seconds (!!), rather than the initial 24 hours (as seen in the video below).

We leak the full root password hash from the /etc/shadow file on Linux in around 30 seconds using a TAA-optimized RIDL exploit.

Why only release this now?

In short, this was part of our original RIDL submission but was missed by Intel during the first embargo and a new last-minute embargo followed. Check out the updated timeline at the end of the page or keep reading for the full story.

On Sep 29, 2018, we submitted several proof-of-concept exploits (PoCs) for a number of RIDL variants to Intel. Despite our many attempts, we received no technical feedback/questions on our submission except that Intel was working on the mitigations.

In fact, due to a lack of transparency on Intel's part, we only got a complete picture on Intel's MDS disclosure plan on May 10, 2019, just 4 days before public disclosure. We were able to find the microcode updates published by Intel online and tested them on the same day. We quickly found that Intel's fixes did not fully mitigate the vulnerabilities we had reported in Sep 2018 and immediately informed Intel.

On May 13, 2019, just one day from the RIDL/MDS public disclosure date, Intel requested TAA and any other RIDL issues that were not mentioned in the MDS whitepaper to be placed under a new last-minute embargo until Nov 12, 2019. At the request of Intel, and to protect users, we complied to the new embargo, withheld several details from the RIDL paper (leaving only some traces of our results in Table I), and did not release our now public RIDL test suite.

On July 3, 2019, we finally learned that, to our surprise, the Intel PSIRT team had missed the PoCs from our Sep 29 submission, despite having awarded a bounty for it, explaining why Intel had failed to address - or even publicly acknowledge - many RIDL-class vulnerabilities on May 14, 2019.

On Oct 15, 2019, we learned that Intel had not found this issue internally and the only other independent finder was the Zombieload team, which disclosed TAA to Intel in April, 2019. The MEDUSA team later published an independent analysis using TAA.

On Oct 25, 2019, we tested Intel's latest microcode update, and still saw leaks with the VERW mitigation enabled, using the RIDL PoCs we shared with Intel in May 2019. We notified Intel and shared a polished PoC to make the issue clear. Intel requested a new embargo and yet suggested adding the following to our first RIDL addendum: "A new microcode update release by Intel in November is required to adequately address the issue".

On Nov 12, 2019, TAA and the other scheduled RIDL issues are disclosed. Unfortunately, we believe that, given the piecemeal (variant-by-variant) mitigation approach pursued by Intel, RIDL-class vulnerabilities won't disappear any time soon.

We are particularly worried about Intel's mitigation plan being PoC-oriented with a complete lack of security engineering and underlying root cause analysis, with minor variations in PoCs leading to new embargoes, and these "new" vulnerabilities remaining unfixed for lengthy periods. Unfortunately, until there is sufficient public / industry pressure, there seems to be little incentive for Intel to change course, leaving the public with a false sense of security. Slapping a year-long embargo after another (many news cycles apart) and keeping vulnerabilities (too) many people are aware of from the public for a long time is still a viable strategy.