Reverse Engineering 2

Instructor: ItisCaleb

File format

ELF

ELF stands for Executable and Linkable Format

It is the format of executable programs on Linux

An ELF file can be divided into

  • header
  • section

ELF

ELF header

The ELF header holds metadata such as

  • byte order (big/little endian)
  • processor architecture
  • memory segments
  • etc...

ELF section

Sections hold the actual data itself

  • code
  • global variables
ELF section

Common sections

  • .text: code
  • .bss: uninitialized global variables
  • .data: initialized global variables
  • .rodata: initialized read-only global variables

ELF

You can inspect an ELF file with the readelf command

readelf -a <your elf>

Memory segments

One of the loader's jobs is to place the file's segments into memory

Memory segments

low address

text
data
bss
heap
free space
stack
environments

high address

PE

The full name is Portable Executable

It is the file format used by Windows programs

The .exe files you usually see are all in PE format

PE

Like ELF, it has headers and sections

but it has many more kinds of headers and sections

System call

System Call

A Linux program must use system calls whenever it interacts with the operating system (file system, network, running programs)

System Call(x86)

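Linux decides which system call to run from the contents of the registers

  • syscall (x86-64) or int 0x80 (x86): the instruction that triggers the system call
  • rax, eax: the number of the syscall to run, and also the return value
  • rdi rsi rdx r10 r8 r9: arguments on x86-64
  • ebx ecx edx esi edi ebp: arguments on x86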

System Call(x86)

mov rax, 0x3c   // syscall number 0x3c (60) = exit
mov rdi, 0      // first argument: exit status 0
syscall         // exit(0)

System Call

On Linux, programs interact with the operating system through system calls

On Windows, they use the Windows API instead

System Call

Not sure what a particular syscall or Windows API function does?

GOOGLE

Calling convention

Calling Convention

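When a function is called, the compiler places the arguments in specific registers or on the stack, depending on the instruction set architecture and the platform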

Calling Convention

  • Linux x86-64
    • rdi rsi rdx rcx r8 r9
  • Windows x86-64
    • rcx rdx r8 r9
  • x86: everything goes on the stack

Calling Convention

// x86-64
#include <stdio.h>

int sum(int a,int b){
    return a+b;
}

int main(){
    int a,b;
    a = 5;
    b = 4;
    printf("%d",sum(a,b));
    return 0;
}

// sum() compiled for Linux x86-64: a arrives in edi, b in esi
sum:
    endbr64
    push    rbp
    mov     rbp, rsp
    mov     DWORD PTR -4[rbp], edi    // store a
    mov     DWORD PTR -8[rbp], esi    // store b
    mov     edx, DWORD PTR -4[rbp]
    mov     eax, DWORD PTR -8[rbp]
    add     eax, edx                  // return value goes in eax
    pop     rbp
    ret

Stack frame

Stack frame

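While a program runs, it uses stack frames to hold local variables

Each function call carves out a new stack frame in memory and places that function's local variables in it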

Stack frame

<sum>:
	push   rbp
	mov    rbp,rsp
	sub    rsp,0x10
	...
	leave
	ret

call func = push rip; jmp func;   (the pushed rip is the return address, i.e. the instruction after the call)

leave = mov rsp,rbp; pop rbp;

ret = pop rip

<main>:
	push   rbp
	mov    rbp,rsp
	sub    rsp,0x10
	...
	call   1149 <sum>
	...
	leave
	ret

Stack frame

Step-by-step trace of main calling sum. Each stack slot is 8 bytes, and the stack grows toward lower addresses.

Stack layout at the deepest point (inside sum, after sub rsp,0x10):

address                           contents
0x7fffffffe3f8                    rsp on entry to main
0x7fffffffe3f0                    initial rbp value (saved by main)
0x7fffffffe3e0 - 0x7fffffffe3ef   main stack frame
0x7fffffffe3d8                    0x55555555518a (return address into main)
0x7fffffffe3d0                    0x7fffffffe3f0 (saved rbp)
0x7fffffffe3c0 - 0x7fffffffe3cf   sum stack frame

rsp and rbp after each instruction:

instruction             rsp               rbp
(entry to main)         0x7fffffffe3f8    initial value
main: push rbp          0x7fffffffe3f0    initial value
main: mov rbp,rsp       0x7fffffffe3f0    0x7fffffffe3f0
main: sub rsp,0x10      0x7fffffffe3e0    0x7fffffffe3f0
main: call 1149 <sum>   0x7fffffffe3d8    0x7fffffffe3f0
sum:  push rbp          0x7fffffffe3d0    0x7fffffffe3f0
sum:  mov rbp,rsp       0x7fffffffe3d0    0x7fffffffe3d0
sum:  sub rsp,0x10      0x7fffffffe3c0    0x7fffffffe3d0
sum:  leave             0x7fffffffe3d8    0x7fffffffe3f0
sum:  ret               0x7fffffffe3e0    0x7fffffffe3f0
main: leave             0x7fffffffe3f8    initial value

call func = push rip
            jmp  func
(the call pushes the return address 0x55555555518a)

leave = mov rsp,rbp
        pop rbp

ret = pop rip
(in sum, ret pops 0x55555555518a back into rip, so execution continues in main)

After main's leave, rsp is back at 0x7fffffffe3f8 and the final ret returns to main's caller.

Endian and Struct

Endian

Data is generally stored in one of two byte orders: Big endian and Little endian

0x7fffffab (4 bytes)

Big Endian:     7f ff ff ab

Little Endian:  ab ff ff 7f

Endian

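Big endian is easier to read, but Little endian is more common in programs

It lets a program read the low-order bytes of a value without changing the memory address

Big endian, on the other hand, shows up in network transmission

int a = 0x7fffffab             in memory: ab ff ff 7f

short b = (short) 0x7fffffab   same address, first two bytes only: ab ff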

Struct

struct list{
	struct list *prev, *next;
	int value;
};
address          contents
0x7fffffff0000   prev (8 bytes)  | next (8 bytes)
0x7fffffff0010   value (4 bytes) | not used (4 bytes)

expected size: 16 + 4 = 20 bytes

actual size: 24 bytes

Struct

struct list{
	struct list *prev, *next;
	int value;
};

expected size: 16 + 4 = 20 bytes

actual size: 24 bytes (4 bytes of padding keep the size a multiple of the 8-byte pointer alignment)

(gdb) b main
(gdb) r
(gdb) print sizeof(struct list)
$1 = 24

Reverse Engineering 2

By ItisCaleb (Caleb)
