Reverse Engineering 2
Instructor: ItisCaleb
File format
ELF
Short for Executable and Linkable Format
The executable file format used on Linux
An ELF file can be divided into
- header
- section
ELF
ELF header
The ELF header holds metadata about the file, such as
- byte order (big/little endian)
- processor architecture
- memory segments
- etc.
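The header fields listed above can be read straight from the first bytes of a file. The sketch below is a minimal illustration, not a full parser: the function names are made up for this example, and the offsets (class at byte 4, byte order at byte 5, the 16-bit machine field at byte 18) are those defined by the ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offsets inside e_ident, the first 16 bytes of every ELF file. */
enum { EI_CLASS_OFF = 4, EI_DATA_OFF = 5, E_MACHINE_OFF = 18 };

/* Returns 1 if buf starts with the ELF magic bytes 7f 45 4c 46, otherwise 0. */
int is_elf(const unsigned char *buf, size_t len) {
    assert(buf != NULL);
    return len >= 20 && memcmp(buf, "\x7f" "ELF", 4) == 0;
}

/* Returns 32 or 64 for the ELF class, or 0 if the value is unknown. */
int elf_bits(const unsigned char *buf) {
    switch (buf[EI_CLASS_OFF]) {
    case 1: return 32;   /* ELFCLASS32 */
    case 2: return 64;   /* ELFCLASS64 */
    default: return 0;
    }
}

/* Returns 1 for little endian, 0 for big endian, -1 if unknown. */
int elf_is_little_endian(const unsigned char *buf) {
    switch (buf[EI_DATA_OFF]) {
    case 1: return 1;    /* ELFDATA2LSB */
    case 2: return 0;    /* ELFDATA2MSB */
    default: return -1;
    }
}

/* e_machine is stored in the file's own byte order (3 = x86, 62 = x86-64). */
unsigned elf_machine(const unsigned char *buf) {
    if (elf_is_little_endian(buf) == 1)
        return (unsigned)buf[E_MACHINE_OFF] | ((unsigned)buf[E_MACHINE_OFF + 1] << 8);
    return ((unsigned)buf[E_MACHINE_OFF] << 8) | (unsigned)buf[E_MACHINE_OFF + 1];
}
```

To try it, read the first 20 bytes of any binary into a buffer and pass it to these functions; the results should agree with what readelf -h reports.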
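ELF section
Sections hold the actual data
- code
- global variables
ELF section
Common sections
- .text: code
- .bss: uninitialized global variables
- .data: initialized global variables
- .rodata: initialized read-only global data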
ELF
It can be inspected with the readelf command
readelf -a <your elf>
Memory segments
One of the loader's jobs is to place the segments from the file into memory
Memory segments
low address
text |
data |
bss |
heap (grows toward high addresses) |
free space |
stack (grows toward low addresses) |
environments |
high address
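One way to see this layout in a running process is to take one address from each region and compare them. The following is a rough sketch with hypothetical names (struct layout, sample_layout). The actual values change between runs because of ASLR, and the ordering shown in the diagram is typical for Linux rather than guaranteed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

int initialized_global = 42;   /* stored in .data */
int uninitialized_global;      /* stored in .bss  */

/* One sample address per memory region. */
struct layout {
    uintptr_t text, data, bss, heap, stack;
};

/* Records one address from each region of the running process. */
struct layout sample_layout(void) {
    int local = 0;                        /* lives on the stack */
    void *block = malloc(16);             /* lives on the heap  */
    assert(block != NULL);

    struct layout l;
    l.text  = (uintptr_t)&sample_layout;  /* function code is in .text */
    l.data  = (uintptr_t)&initialized_global;
    l.bss   = (uintptr_t)&uninitialized_global;
    l.heap  = (uintptr_t)block;
    l.stack = (uintptr_t)&local;

    free(block);
    return l;
}
```

Printing the five fields with %#lx (after a cast to unsigned long) on a typical Linux x86-64 system tends to show text, data and bss close together at low addresses and the stack near the top of the address space.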
PE
Short for Portable Executable
The file format used by Windows programs
The familiar .exe files are in PE format
PE
Like ELF, it has a header and sections
but it defines many more headers and sections
System call
System Call
When a Linux program needs to interact with the operating system (file system, network, running other programs), it has to use a system call
System Call(x86)
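Linux decides which system call to run from the contents of the registers
- the syscall instruction on x86-64, int 0x80 on x86
- rax / eax: the number of the syscall to run, and afterwards its return value
- rdi rsi rdx r10 r8 r9: arguments on x86-64
- ebx ecx edx esi edi ebp: arguments on x86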
System Call(x86)
mov rax, 0x3c
mov rdi, 0
syscall            ; exit(0)
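The same mechanism is reachable from C through the syscall() wrapper in glibc, which loads the registers and executes the instruction for you. The sketch below assumes Linux with glibc; raw_write and raw_getpid are illustrative names, not standard functions.

```c
#define _GNU_SOURCE            /* makes syscall() visible under strict -std modes */
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Writes msg to stdout through the raw write system call.
   Returns the number of bytes written, or -1 on error. */
long raw_write(const char *msg) {
    assert(msg != NULL);
    return syscall(SYS_write, STDOUT_FILENO, msg, strlen(msg));
}

/* Asks the kernel for the process ID without going through getpid(). */
long raw_getpid(void) {
    return syscall(SYS_getpid);
}
```

Running a program that calls these under strace should show the corresponding write and getpid entries, which is a convenient way to check which system calls a binary makes.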
System Call
Linux programs interact with the operating system through system calls
while Windows programs go through the Windows API
System Call
Not sure what a particular syscall or Windows API function does?
Calling convention
Calling Convention
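When a function is called, the compiler places the arguments in particular registers or on the stack, depending on the instruction set architecture and platform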
Calling Convention
- Linux x86-64
  - rdi rsi rdx rcx r8 r9, then the stack
- Windows x86-64
  - rcx rdx r8 r9, then the stack
- x86: everything goes on the stack
Calling Convention
// x86-64
#include <stdio.h>

int sum(int a, int b) {
    return a + b;
}

int main() {
    int a, b;
    a = 5;
    b = 4;
    printf("%d", sum(a, b));
    return 0;
}
sum:
endbr64
push rbp
mov rbp, rsp
mov DWORD PTR -4[rbp], edi
mov DWORD PTR -8[rbp], esi
mov edx, DWORD PTR -4[rbp]
mov eax, DWORD PTR -8[rbp]
add eax, edx
pop rbp
ret
Stack frame
Stack frame
While a program runs, it uses stack frames to hold local variables
Each function call sets up a new stack frame in memory and places that function's local variables in it
Stack frame
<sum>:
push rbp
mov rbp,rsp
sub rsp,0x10
...
leave
ret
call func = push rip; jmp func;
leave = mov rsp,rbp; pop rbp;
ret = pop rip
<main>:
push rbp
mov rbp,rsp
sub rsp,0x10
...
call 1149 <sum>
...
leave
ret
Stack frame
On entry to <main>, before push rbp:
address | contents | pointer
---|---|---
0x7fffffffe3f8 | | rsp
Stack frame
After push rbp in <main>:
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value | rsp
Stack frame
After mov rbp,rsp in <main>:
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value | rsp, rbp
Stack frame
After sub rsp,0x10 in <main>:
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value | rbp
0x7fffffffe3e0 | main stack frame | rsp
Stack frame
At call 1149 <sum> in <main>: call func = push rip; jmp func, so the return address 0x55555555518a is about to be pushed
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value | rbp
0x7fffffffe3e0 | main stack frame | rsp
Stack frame
After the call, at the first instruction of <sum>:
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value | rbp
0x7fffffffe3e0 | main stack frame |
0x7fffffffe3d8 | 0x55555555518a (return address) | rsp
Stack frame
After push rbp in <sum>:
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value | rbp
0x7fffffffe3e0 | main stack frame |
0x7fffffffe3d8 | 0x55555555518a (return address) |
0x7fffffffe3d0 | 0x7fffffffe3f0 (saved rbp) | rsp
Stack frame
After mov rbp,rsp in <sum>:
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value |
0x7fffffffe3e0 | main stack frame |
0x7fffffffe3d8 | 0x55555555518a (return address) |
0x7fffffffe3d0 | 0x7fffffffe3f0 (saved rbp) | rsp, rbp
Stack frame
After sub rsp,0x10 in <sum>:
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value |
0x7fffffffe3e0 | main stack frame |
0x7fffffffe3d8 | 0x55555555518a (return address) |
0x7fffffffe3d0 | 0x7fffffffe3f0 (saved rbp) | rbp
0x7fffffffe3c0 | sum stack frame | rsp
Stack frame
Before leave in <sum>: leave = mov rsp,rbp; pop rbp
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value |
0x7fffffffe3e0 | main stack frame |
0x7fffffffe3d8 | 0x55555555518a (return address) |
0x7fffffffe3d0 | 0x7fffffffe3f0 (saved rbp) | rbp
0x7fffffffe3c0 | sum stack frame | rsp
Stack frame
After leave in <sum>, before ret: ret = pop rip
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value | rbp
0x7fffffffe3e0 | main stack frame |
0x7fffffffe3d8 | 0x55555555518a (return address) | rsp
Stack frame
After ret, back in <main> at the instruction following the call:
address | contents | pointer
---|---|---
0x7fffffffe3f8 | |
0x7fffffffe3f0 | initial rbp value | rbp
0x7fffffffe3e0 | main stack frame | rsp
Stack frame
After leave in <main>, before ret: rbp holds its initial value again and the stack is back to its state on entry
address | contents | pointer
---|---|---
0x7fffffffe3f8 | | rsp
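The walkthrough shows the frame of sum sitting at lower addresses than the frame of main. A small experiment can confirm that on a real machine. This sketch relies on two GCC/Clang extensions, __builtin_frame_address and __attribute__((noinline)), and the function names are made up for the example; it assumes an architecture such as x86-64 where the stack grows toward lower addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the frame address of this function
   (on x86-64 with a frame pointer, the value held in rbp). */
__attribute__((noinline)) uintptr_t callee_frame(void) {
    return (uintptr_t)__builtin_frame_address(0);
}

/* Returns 1 if the callee's frame sits at a lower address than the caller's. */
__attribute__((noinline)) int stack_grows_down(void) {
    uintptr_t caller = (uintptr_t)__builtin_frame_address(0);
    assert(caller != 0);
    uintptr_t callee = callee_frame();   /* pushes a return address, then a new frame */
    return callee < caller;
}
```

The noinline attribute matters here: if the compiler inlined callee_frame, no call would happen and both functions would share a single frame.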
Endian and Struct
Endian
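There are generally two ways to store multi-byte data: big endian and little endian
0x7fffffab (4 bytes), lowest address on the left
byte order | byte 0 | byte 1 | byte 2 | byte 3
---|---|---|---|---
Big Endian | 7f | ff | ff | ab
Little Endian | ab | ff | ff | 7f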
Endian
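Big endian is easier to read, but little endian is more common in programs
It lets a program read the low-order part of a value without changing the memory address
Big endian shows up in network transmission
int a = 0x7fffffab is stored as
ab | ff | ff | 7f |
short b = (short) 0x7fffffab reads only the first two bytes at the same address
ab | ff |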
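The byte orders above can be checked in a few lines. This is a minimal sketch with illustrative function names: it uses memcpy to look at raw bytes, which avoids pointer-cast aliasing issues, and it builds the big-endian (network order) form with shifts so that the result does not depend on the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void) {
    uint32_t value = 0x7fffffab;
    unsigned char first;
    memcpy(&first, &value, 1);       /* the byte at the lowest address */
    return first == 0xab;
}

/* Reads the first two bytes of a 32-bit value as a 16-bit value,
   i.e. the same address reinterpreted as a shorter type. */
uint16_t first_two_bytes(uint32_t value) {
    uint16_t low;
    memcpy(&low, &value, sizeof low);
    return low;
}

/* Serializes value into out[0..3] in big-endian (network) byte order. */
void store_big_endian(uint32_t value, unsigned char out[4]) {
    assert(out != NULL);
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)value;
}
```

On a little-endian machine first_two_bytes(0x7fffffab) yields 0xffab, the low half of the value, which is the property the slide describes. On a big-endian machine the same read yields the high half, 0x7fff.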
Struct
struct list{
struct list *prev, *next;
int value;
};
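address | field | size
---|---|---
0x7fffffff0000 | prev | 8 bytes
0x7fffffff0008 | next | 8 bytes
0x7fffffff0010 | value | 4 bytes
0x7fffffff0014 | not used | 4 bytes
expected size: 16 + 4 = 20 bytes
actual size: 24 bytes (4 bytes of padding round the size up to a multiple of the 8-byte pointer alignment)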
Struct
(gdb) b main
(gdb) r
(gdb) print sizeof(struct list)
$1 = 24
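The same numbers can be obtained without a debugger by using offsetof and sizeof. The sketch below reuses the struct from the slide; list_tail_padding is a hypothetical helper added for illustration, and the concrete values 24 and 4 assume a 64-bit platform with 8-byte pointers and 4-byte int.

```c
#include <assert.h>
#include <stddef.h>

struct list {
    struct list *prev, *next;
    int value;
};

/* Number of padding bytes the compiler appended after the last field. */
size_t list_tail_padding(void) {
    return sizeof(struct list) - (offsetof(struct list, value) + sizeof(int));
}

/* The size is a multiple of the struct's alignment, so every element of an
   array of struct list stays correctly aligned. */
static_assert(sizeof(struct list) % _Alignof(struct list) == 0,
              "struct size must be a multiple of its alignment");
```

When reversing a binary, this is why the stride between array elements can be larger than the sum of the field sizes: the unused bytes are part of every element.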
Reverse Engineering 2
By ItisCaleb (Caleb)