Speculative execution, variant 4: speculative store bypass

bugs.chromium.org

184 points by brandon 8 years ago · 21 comments

kashyapc 8 years ago

If you are using Linux-based virtualization (KVM), besides an updated kernel and Intel microcode (which is not yet available), you also need updates for the relevant layers: QEMU and libvirt. Patches are posted[1][2].

A new Intel CPU feature flag now needs to be exposed to virtual machines: 'ssbd' (Speculative Store Bypass Disable).

On microcode, from Red Hat's blog post[3]:

In many (but not all) cases, full mitigation will also require updated microcode from the system microprocessor vendor. Red Hat intends to ship updated microcode as a convenience to our customers as it is made available to us. In the interim, customers are strongly advised to contact their OEM, ODM, or system manufacturer to receive this via a system BIOS update.

[1] https://www.redhat.com/archives/libvir-list/2018-May/msg0156...

[2] https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg047...

[3] https://www.redhat.com/en/blog/speculative-store-bypass-expl...

my123 8 years ago

AMD guidance:

https://developer.amd.com/wp-content/resources/124441_AMD64_...

(Set a CPU-specific MSR and it's done for current CPUs; no microcode updates required.)

https://www.amd.com/en/corporate/security-updates has: "We have not identified any AMD x86 products susceptible to the Variant 3a vulnerability in our analysis to-date."

  • kashyapc 8 years ago

    For AMD, in the context of virtualization, you also need to expose a new CPUID flag: 'virt-ssbd', which all hypervisor vendors will expose to guests on AMD hosts. More from the libvirt patch[1]:

    Some AMD processors only support a non-architectural means of enabling Speculative Store Bypass Disable. To allow simplified handling in virtual environments, hypervisors will expose an architectural definition through CPUID bit 0x80000008_EBX[25]. This needs to be exposed to guest OS running on AMD x86 hosts to allow them to protect against CVE-2018-3639.

    Note that since this CPUID bit won't be present in the host CPUID results on physical hosts, it will not be enabled automatically in guests configured with "host-model" CPU unless using QEMU version >= 2.9.0. Thus for older versions of QEMU, this feature must be manually enabled using policy=force. Guests using the "host-passthrough" CPU mode do not need special handling.

    [1] https://www.redhat.com/archives/libvir-list/2018-May/msg0156...
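
A guest can check the CPUID bit quoted above for itself. What follows is a minimal sketch, assuming GCC or Clang on x86 with the compiler-provided `<cpuid.h>` header; the function and macro names are hypothetical and do not come from the libvirt or QEMU patches. The only fact taken from the quote is the bit position, 0x80000008_EBX[25].

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header providing __get_cpuid() */
#endif

/* Leaf and bit position as quoted from the libvirt patch description. */
#define VIRT_SSBD_LEAF    0x80000008u
#define VIRT_SSBD_EBX_BIT 25u

/* Return 1 if the given bit of a 32-bit register value is set, else 0. */
int reg_bit_set(uint32_t reg, unsigned bit)
{
    return (int)((reg >> bit) & 1u);
}

/*
 * Return 1 if the hypervisor exposes virt-ssbd, 0 if it does not,
 * and -1 if the leaf is unavailable or this is not an x86 build.
 */
int guest_has_virt_ssbd(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* __get_cpuid() returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(VIRT_SSBD_LEAF, &eax, &ebx, &ecx, &edx))
        return -1;
    return reg_bit_set(ebx, VIRT_SSBD_EBX_BIT);
#else
    return -1;
#endif
}
```

On a physical AMD host the quote says this bit will not be present, so a result of 0 there is expected; the check is only meaningful inside a guest.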

  • tedunangst 8 years ago

    I like that it's specex variant 4 and spectre variant 3. Keeps everybody sharp.

    • bonzini 8 years ago

      Special register read is called "variant 3a" because it allows you to break the privilege level separation like Meltdown and, back in November, Meltdown was called "variant 3". Variant 1 was conditional-branch Spectre (speculative out of bounds accesses) while variant 2 was indirect-branch Spectre (the one that could be used to read host memory from a virtual machine).

ENOTTY 8 years ago

These are the links I found most explanatory

https://bugs.chromium.org/p/project-zero/issues/detail?id=15...

https://software.intel.com/sites/default/files/managed/b9/f9...

https://software.intel.com/sites/default/files/managed/c5/63...

https://blogs.technet.microsoft.com/srd/2018/05/21/analysis-...

https://developer.amd.com/wp-content/resources/124441_AMD64_...

https://www.intel.com/content/www/us/en/security-center/advi... uCode update is only for variant 3a (MSR read) and for the global disable bit in the MSR. The standard mitigation is still LFENCE.

https://docs.microsoft.com/en-us/cpp/security/developer-guid... vulnerable code examples
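
To make the LFENCE remark above concrete, here is a minimal sketch, assuming GCC or Clang inline assembly, of placing a fence between a store and the load that depends on it. The macro and function names are hypothetical and not taken from any of the linked documents. On non-x86 builds the macro falls back to a plain compiler barrier, which does not stop hardware speculation.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* LFENCE: later loads do not execute until earlier instructions complete. */
#define SPEC_LOAD_FENCE() __asm__ __volatile__("lfence" ::: "memory")
#else
/* Assumption: compiler-only barrier as a placeholder on other targets. */
#define SPEC_LOAD_FENCE() __asm__ __volatile__("" ::: "memory")
#endif

/*
 * Overwrite *slot, then use the new value as a table index.
 * Without the fence, a CPU that speculatively bypasses the store
 * could perform the lookup with the stale value of *slot.
 */
uint8_t store_then_lookup(uint8_t *slot, uint8_t new_index,
                          const uint8_t *table)
{
    *slot = new_index;      /* the store that a later load might bypass */
    SPEC_LOAD_FENCE();      /* keep the dependent load behind the store */
    return table[*slot];    /* load of *slot, then the dependent lookup */
}
```

The table passed in must have at least 256 entries, since any `uint8_t` value can be used as the index.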

swonderl 8 years ago

Explained: https://www.redhat.com/en/blog/speculative-store-bypass-expl...

cesarb 8 years ago

A commenter over at arstechnica (https://arstechnica.com/gadgets/2018/05/new-speculative-exec...) found an old article explaining the optimization which led to this vulnerability: "Faster Load Times - Intel Core versus AMD's K8 architecture" https://www.anandtech.com/show/1998/5

pedro84 8 years ago

Additional vendor info:

https://developer.arm.com/support/arm-security-updates/specu...

https://blogs.technet.microsoft.com/srd/2018/05/21/analysis-...

https://www.intel.com/content/www/us/en/security-center/advi...

exikyut 8 years ago

Possibly completely unrelated question (this stuff is firmly over my head): toward the end of the first PoC there's

      /* if we don't break the loop after some time when it doesn't
         work, in NO_INTERRUPTS mode with SMP disabled, the machine will lock
         up */

The bit at the top that says

      ======== Demo code (no privilege boundaries crossed) ========

is suggestive and unambiguous, but the program executions show (with "$"s) that this is being executed as non-root.

So... is this deadlock fundamentally related to the speculative execution glitch(es)?

  • userbinator 8 years ago

    That makes me curious; I wonder what on a modern PC relies on periodic interrupts, and would lock up the machine if they didn't occur. I know on the original PC and XT, DRAM refresh relied on interrupts but that stopped being the case with the AT:

    https://www.reenigne.org/blog/how-to-get-away-with-disabling...

    • gizmo686 8 years ago

      Isn't preemption still interrupt-based? Without interrupts, there would be nothing to cause the CPU to stop executing the demo code. With SMP disabled, this means that nothing else will get a chance to run until the demo code yields itself (or re-enables interrupts).

  • geogriffin 8 years ago

    You wouldn't be able to disable interrupts as non-root. The iopl syscall allows the PoC to use CLI to disable interrupts. See the "sudo" in the NO_INTERRUPTS runs:

      $ gcc -o test test.c -Wall -DHIT_THRESHOLD=50 -DNO_INTERRUPTS
      $ sudo ./test
    
    I would guess the deadlock is due to a hardware watchdog timer rebooting the system, or some other hardware function that needs to be tended to periodically before it hangs.

    • pdkl95 8 years ago

      It doesn't look like a deadlock; turning off interrupts prevents the preemptive scheduler from running. Without a timer interrupt, the only way the scheduler would run is if it's invoked to put the process to sleep during a blocking syscall or explicitly with sched_yield(2), pthread_yield(3), etc.

      If interrupts are off, the PoC program might wait forever for "hits > 32" if testfun() never detects a "hit". Giving up after 1M busy loops ("cycles < 1000000") should prevent this from happening... but... I wonder...

          gcc -o test test.c -Wall -DHIT_THRESHOLD=50 -DNO_INTERRUPTS
      
      Without -O0, some optimizations are still enabled. Could a modern "clever" optimizing compiler assume that the speculative "hit" never happens and therefore conclude that "cycles" is only used after the loop, when it is "guaranteed" to have the value 1000000, and "optimize" the loop into something like

          /*long cycles = 0;*/ //DEAD
          while (hits < 32 /*&& cycles < 1000000*/) { //DEAD
              // ... rest of loop body, maybe?
              /* cycles++; */ //DEAD
              pipeline_flush();
          /*}*/ //DEAD
      
      and the sprintf() into something like:

          sprintf(out_, "%c: %s in 1000000 cycles (hitrate: %f%%)\n",
              secret_read_area[idx], results, 100*hits/(double)(1000000));
      
      I'm probably worrying about nothing. Or at least I should be worrying about nothing, but with the current trend of "clever" optimizers exploiting everything they think is provable, I'm no longer certain. bleh

      • caf 8 years ago

        The pipeline_flush() asm block has a "memory" clobber which will certainly prevent this kind of optimisation.
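
For readers following this subthread, here is a minimal sketch of the loop shape under discussion: poll until enough hits are seen, but give up after a fixed number of cycles, with an asm statement carrying a "memory" clobber inside the loop. This is not the PoC's code. All names are hypothetical, the probe callback stands in for testfun(), and the empty asm stands in for pipeline_flush(), whose real body is not shown in this thread. It assumes GCC or Clang extended asm.

```c
#include <stddef.h>

#define HITS_NEEDED 32          /* threshold mentioned in the thread */
#define MAX_CYCLES  1000000L    /* give-up bound mentioned in the thread */

/* Hypothetical probe: returns nonzero when a "hit" was detected. */
typedef int (*probe_fn)(void *ctx);

struct poll_result {
    long hits;
    long cycles;
};

/* Stand-in for pipeline_flush(): emits no instructions, but the
 * "memory" clobber tells the compiler that memory may have changed. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* Run the probe until HITS_NEEDED hits or MAX_CYCLES iterations. */
struct poll_result poll_until_hits(probe_fn probe, void *ctx)
{
    struct poll_result r = {0, 0};

    while (r.hits < HITS_NEEDED && r.cycles < MAX_CYCLES) {
        if (probe(ctx))
            r.hits++;
        r.cycles++;
        compiler_barrier();
    }
    return r;
}

/* Two trivial probes used to exercise both exit conditions. */
int never_hit(void *ctx)  { (void)ctx; return 0; }
int always_hit(void *ctx) { (void)ctx; return 1; }
```

With a probe that never hits, the loop exits through the cycle bound; with one that always hits, it exits after 32 iterations. Either way the loop terminates, which is the property the PoC's comment relies on when interrupts are disabled.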

    • exikyut 8 years ago

      Woops! Completely missed that, heh.

      There go my plans for non-root system lockup :(

  • mrob 8 years ago

    Locking up the machine is a possibility when you disable interrupts. Disabling interrupts with the cli instruction needs the IO privilege level (IOPL) to be at least as high as the protection ring the code is running in. Linux runs userspace code in ring 3, so the IOPL has to be set to 3 with the iopl call first. This requires the CAP_SYS_RAWIO capability, which allows you to do pretty much anything already.
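
The privilege rule described above fits in one comparison. This is a minimal sketch of the simplified protected-mode check, ignoring virtual-8086 mode and the protected-mode virtual interrupt extensions; the function name is hypothetical. Ring numbers are 0 (most privileged) to 3, and the statement that Linux normally leaves IOPL at 0 is an assumption, not something stated in the thread.

```c
#include <stdbool.h>

/*
 * CLI executes without a fault when the current privilege level (CPL)
 * is numerically less than or equal to the I/O privilege level (IOPL).
 */
bool cli_permitted(unsigned cpl, unsigned iopl)
{
    return cpl <= iopl;
}
```

For example, `cli_permitted(3, 0)` is false (ring 3 code with IOPL 0), while `cli_permitted(3, 3)` is true, which is the state the PoC reaches after a successful iopl call.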

rbanffy 8 years ago

The MS advisory: https://portal.msrc.microsoft.com/en-US/security-guidance/ad...
