SMoTherSpectre
Exploiting Speculative Execution through Port Contention
The 2018 Shock
Meltdown Leaks Kernel Memory
Spectre leaks arbitrary memory across bounds checks and even out of sandboxes
Detour to a quick crash course on Meltdown and Spectre
Genesis
- There are two key steps in Meltdown & Spectre
- Source of the attack (OoO and speculative execution)
- Medium of leak (predominantly Caches)
- Umpteen works on sealing the medium of leakage (Caches, BPU, etc.)
- But how do we know the number of mediums is finite, and that sealing one medium kills the attack?
- This work shows that there is yet another medium of leakage: contention on execution ports
- (What does it tell us?) - It might be more worthwhile to go after the source than after the medium
Another disadvantage with Cache side channels
- Consider Spectre V2 - Branch Target injection
- Pollute BTB - speculate target to branch to a gadget code and execute it (gadget code with loads from critical memory)
- But finding gadget code that brings the data into the cache is very hard, except in a few cases such as eBPF (extended Berkeley Packet Filter)
- Hence, costly solutions like Retpoline (indirect branches don't speculate) are not applied widely
- However, the current work claims to make V2 more practical - since we are not limited to load gadgets alone!
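To make the BTB pollution step concrete, here is a minimal toy model in Python. It assumes a BTB indexed only by the low bits of the branch address, which is a deliberate simplification: real predictors use undocumented indexing and branch history. The class `ToyBTB` and all addresses are hypothetical and chosen only for illustration.

```python
class ToyBTB:
    """Toy branch target buffer indexed by the low bits of the branch address.

    Assumption: no tags and no history, so two branches whose addresses
    share the low index bits alias to the same entry.
    """

    def __init__(self, index_bits=8):
        self.mask = (1 << index_bits) - 1
        self.entries = {}

    def update(self, branch_addr, target):
        # Record the resolved target of an executed indirect branch.
        self.entries[branch_addr & self.mask] = target

    def predict(self, branch_addr):
        # Return the speculated target, or None if the entry is empty.
        return self.entries.get(branch_addr & self.mask)


# Hypothetical addresses: both branches end in 0x34, so they alias.
ATTACKER_BRANCH = 0x7F001234
VICTIM_BRANCH = 0x55001234
GADGET = 0x55009000          # where the attacker wants the victim to speculate
LEGIT_TARGET = 0x55002000    # where the victim's branch really goes

btb = ToyBTB()
btb.update(ATTACKER_BRANCH, GADGET)      # attacker trains the shared entry
speculated = btb.predict(VICTIM_BRANCH)  # victim speculates to the gadget
btb.update(VICTIM_BRANCH, LEGIT_TARGET)  # branch resolves, entry is corrected
print(hex(speculated), hex(btb.predict(VICTIM_BRANCH)))
```

In this model the victim's first prediction is the attacker's gadget address, and the entry is only corrected once the branch resolves. That window is what the attack uses.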
Idea
SMoTher
- The attack is based on the following observation
- Two SMT threads share Execution units
- Instructions that are scheduled to execute on the same port will contend
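The observation above can be sketched with a toy cycle-level model. It assumes each port issues one uop per cycle and that the two SMT threads alternate strictly when both want the same port. Real port arbitration is more complex, so the numbers only illustrate the direction of the effect. The function name `attacker_cycles` is made up for this sketch.

```python
def attacker_cycles(attacker_port, victim_port, n_uops=1000):
    """Cycles the attacker needs to issue n_uops on attacker_port while the
    victim continuously issues uops on victim_port.

    Assumption: strict round-robin between the two threads on a shared port.
    """
    cycles = 0
    done = 0
    victim_turn = True
    while done < n_uops:
        cycles += 1
        if attacker_port == victim_port:
            if victim_turn:
                victim_turn = False  # victim's uop takes the port, attacker stalls
                continue
            victim_turn = True
        done += 1                    # attacker's uop issues this cycle
    return cycles


contended = attacker_cycles(attacker_port=1, victim_port=1)
uncontended = attacker_cycles(attacker_port=1, victim_port=6)
print(contended, uncontended)
```

Under these assumptions the attacker takes twice as long when the victim uses the same port, and this timing difference is the signal the attacker measures.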
SMoTher Differentiability
- Let one SMT thread run one of a set of instruction sequences \( V = \{V_0, V_1, V_2, \ldots\} \)
- Attacker runs a set of instructions A
- If the attacker can infer which \( V_i \in V \) the victim executed, the sequences in V are said to be SMoTher Differentiable
Victim: Either popcnt (port 1) or ror (port 6)
Attacker: Contending and timing port 1
Victim: Either cmovz (port 6) or popcnt (port 1)
Attacker: Contending and timing port 6 using bts
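One way to think about differentiability is as a classification problem over the attacker's timing samples. The sketch below uses synthetic Gaussian timings, not measurements, and a simple midpoint threshold. The means, standard deviation, and helper names are all assumptions made for illustration.

```python
import random
from statistics import mean


def synth_timings(mu, sigma, n, rng):
    # Synthetic attacker timings (in cycles); NOT real measurements.
    return [rng.gauss(mu, sigma) for _ in range(n)]


def threshold_accuracy(fast_samples, slow_samples):
    """Fraction of samples classified correctly by a midpoint threshold.

    Assumes slow_samples come from the contending victim sequence.
    """
    threshold = (mean(fast_samples) + mean(slow_samples)) / 2
    correct = sum(t < threshold for t in fast_samples)
    correct += sum(t >= threshold for t in slow_samples)
    return correct / (len(fast_samples) + len(slow_samples))


rng = random.Random(0)
no_contention = synth_timings(1000, 25, 2000, rng)  # victim on another port
contention = synth_timings(1100, 25, 2000, rng)     # victim on the timed port
accuracy = threshold_accuracy(no_contention, contention)
print(f"accuracy = {accuracy:.3f}")
```

An accuracy well above 0.5 means the two victim sequences are distinguishable from the attacker's side. An accuracy near 0.5 means they are not.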
Pitfalls
- Synchronization of attacker and victim is crucial. If the code sequence is short, this is hard to achieve
- Pipeline bottlenecks other than port contention can overshadow the side channel (e.g. RAW hazards)
- The CPU may eliminate the execution of some instructions (zero idioms). This removes contention
- Some instructions (e.g. SSE and AVX) are subject to aggressive power-saving features on modern CPUs. This makes measuring very difficult
SMoTherSpectre
- Leverages BTI to poison the BPU so that, on an indirect jump, the victim SMT thread speculatively jumps to an address holding SMoTher-differentiable code
- Unlike cache side channels, you don't need a measurable change in the micro-architectural state here
- Apparently, SMoTher differentiable gadgets are easy to find unlike cache load gadgets
- libcrypto in OpenSSL has 12,000 such gadgets readily available
Key Assumptions
- To maximize the success rate
- Introduce N taken branches before indirect branch
- Disable ASLR
- Evict cache line containing indirect jump pointer
- Know if BTI worked using performance counter
- BR_MISP_EXEC.TAKEN_INDIRECT_JUMP_NON_CALL_RET
Gadgets
- Two types of gadgets are required to make SMoTherSpectre successful: a BTI gadget and a SMoTher gadget
BTI Gadgets
- Pass a secret through a register to an arbitrary code in the same process
- Ideal BTI targets are C++ virtual function calls, which go through a vtable, and dynamically linked ELF calls, which go through the GOT
- If we evict the vtable and GOT entries from the cache, it gives roughly 200 cycles to reliably mount the attack
SMoTher Gadgets
- The gadget is either a part of the victim or an additional attack vector by the attacker
- The gadget consists of,
- Instruction comparing the secret register to a known value
- Followed by a conditional-flow transfer depending on the comparison
- The instructions should have distinct port fingerprint
- Port fingerprinting is the task of ranking two instruction sequences based on their port utilization
- Instructions should not have other memory references to caches - that will inject noise
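The port-fingerprint idea can be sketched as a per-port uop count for each side of the conditional branch. The port table below uses only the single-port assignments mentioned in these slides (popcnt on port 1; ror, cmovz and bts on port 6). That is a simplification, since real instructions may issue on several ports. The two example sequences and the helper names are hypothetical.

```python
from collections import Counter

# Simplified port table taken from the examples in these slides.
PORT_OF = {"popcnt": 1, "ror": 6, "cmovz": 6, "bts": 6}


def port_fingerprint(sequence):
    # Count how many uops of the sequence land on each port.
    return Counter(PORT_OF[ins] for ins in sequence)


def best_port_to_probe(taken_path, fallthrough_path):
    """Rank ports by how differently the two paths load them."""
    fp_taken = port_fingerprint(taken_path)
    fp_fall = port_fingerprint(fallthrough_path)
    ports = sorted(set(fp_taken) | set(fp_fall))
    diffs = {p: abs(fp_taken[p] - fp_fall[p]) for p in ports}
    return max(ports, key=lambda p: diffs[p]), diffs


# Hypothetical instruction sequences after the conditional branch.
taken = ["popcnt"] * 4 + ["ror"]
fallthrough = ["ror"] * 3
port, diffs = best_port_to_probe(taken, fallthrough)
print(port, diffs)
```

The port with the largest difference is the one the attacker would contend on and time. A gadget whose two paths give near-zero differences on every port is a poor SMoTher gadget.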
Real World OpenSSH attack
This calls bzero function using a volatile pointer
Chosen BTI gadget
This instruction checks whether the 8th bit is set at the memory location addressed by (rdi), which holds the secret
This code times the addl and checks for port contention with sub
Secret
Mitigations
- Disable SMT entirely (15% overhead on Intel machines)
- OS can employ a side-channel aware scheduling strategy, i.e. colocate only threads from the same user on the SMT threads of a core
- Can explore the utilities of coarse-grained or interleaved Multithreading as viable alternatives for SMT
- Figure out a solution for BTI (such as Retpoline)
- But it should have very low overhead (lower than disabling SMT, obviously)
- However, SMoTherSpectre can also employ other Spectre variants (e.g. RSB overflow) to mount the attack
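The side-channel aware scheduling idea from the list above can be sketched as a placement rule: a thread may only share a core with SMT siblings owned by the same user. This is a toy policy, not an existing OS interface. The `place` function and the `cores` data layout are hypothetical.

```python
def place(thread_user, cores, smt_width=2):
    """Place a thread on the first core whose SMT siblings all belong to
    thread_user (or that is empty). Returns the core id, or None if the
    thread has to wait.

    cores maps core id -> list of users currently running on that core.
    """
    for core_id, users in cores.items():
        has_free_slot = len(users) < smt_width
        same_owner = all(u == thread_user for u in users)
        if has_free_slot and same_owner:
            users.append(thread_user)
            return core_id
    return None


cores = {0: ["alice"], 1: []}
placements = [
    place("bob", cores),    # cannot share core 0 with alice
    place("alice", cores),  # may join her own thread on core 0
    place("bob", cores),    # may join his own thread on core 1
    place("carol", cores),  # no compatible free slot left
]
print(placements, cores)
```

The cost of such a policy is visible in the last placement: a free SMT slot can go unused when no thread of the matching user is runnable.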
SMoTherSpectre
By Vinod Ganesan