Software reverse engineering
with graph theory
Content
- What is reverse engineering?
- Motivations
- Comparing binaries using graph theory
- Credits
<< Compiling in Java
What is reverse engineering?
"Reverse engineering is the opposite
of compiling"
Convert machine code, which is close to binary,
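#include <stdio.h>
typedef int DEPT; /* assumption: the slide does not define DEPT */
/* A record type, declared before main so the program compiles. */
struct person {
    int age, salary;
    DEPT department;
    char name[12];
    char address[6][20];
};
int main(void)
{
    printf("hello, world\n");
    return 0;
}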
Source code
to source code that is human-readable
Motivations
- Can the software be trusted?
- Stuxnet
- Took more than a year to be discovered
- Took experts half a year to decipher the payload!
"You can't trust code that you did not totally create yourself.
(Especially code from companies that employ people like me.)"
- Ken Thompson, Turing Award winner, 1983
Comparing binaries using graph theory
Opportunities
- Programmers are lazy
- Upsurge in open source code usage
- Programs have their own unique control-flow
#include <stdbool.h>
#include <stdio.h>
int main(void)
{
    printf("hello, world\n");
    bool diet = false;
    if (diet == false)
    {
        printf("%s\n", "eat pizza");
    }
    else
    {
        printf("%s\n", "no eating!");
    }
    return 0;
}
example C program
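#include <stdbool.h>
#include <stdio.h>
int main(void)
{
    /* block 0: straight-line code up to the branch */
    printf("hello, world\n");
    bool diet = false;
    if (diet == false)
    {
        /* block 1: taken when diet is false */
        printf("%s\n", "eat pizza");
    }
    else
    {
        /* block 2: taken when diet is true */
        printf("%s\n", "no eating!");
    }
    /* block 3: both branches join here */
    return 0;
}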
organizing code into basic blocks
we use control-flow as the program's signature
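As a rough illustration of the idea, the example program can be written down as a small control-flow graph: each basic block is a node, each possible jump is an edge. This is a minimal sketch, not the tooling used in the presentation; the names `Cfg`, `cfg_add_edge`, `cfg_out_degree` and `example_cfg` are made up here, and the graph is built by hand rather than recovered from a binary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 8 /* assumed upper bound, enough for the example */

/* A control-flow graph: nodes are basic blocks, edges are possible jumps. */
typedef struct {
    size_t block_count;
    bool edge[MAX_BLOCKS][MAX_BLOCKS]; /* edge[a][b]: control can flow a -> b */
} Cfg;

/* Record that control can pass from block `from` to block `to`. */
void cfg_add_edge(Cfg *g, size_t from, size_t to)
{
    assert(from < g->block_count && to < g->block_count);
    g->edge[from][to] = true;
}

/* Number of blocks that `block` can pass control to. */
size_t cfg_out_degree(const Cfg *g, size_t block)
{
    assert(block < g->block_count);
    size_t n = 0;
    for (size_t to = 0; to < g->block_count; to++) {
        if (g->edge[block][to]) {
            n++;
        }
    }
    return n;
}

/* Hand-built graph for the example program (hypothetical block numbering):
 * 0 = printf and the diet test, 1 = "eat pizza",
 * 2 = "no eating!", 3 = return. */
Cfg example_cfg(void)
{
    Cfg g = { .block_count = 4 }; /* all edges start out false */
    cfg_add_edge(&g, 0, 1); /* diet == false */
    cfg_add_edge(&g, 0, 2); /* diet == true  */
    cfg_add_edge(&g, 1, 3); /* join          */
    cfg_add_edge(&g, 2, 3); /* join          */
    return g;
}
```

The if/else shows up as a diamond: block 0 has two outgoing edges, blocks 1 and 2 have one each, and block 3 has none.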
We extract sub-graphs of size k
[Figure: each sub-graph >> algorithm >> number >> algorithm >> string, e.g. 1214 >> XvxFGF and 1286 >> baNUAL. Other labels: 1 2 4 8]
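One possible way to turn every sub-graph of size k into a number is sketched below. The slides do not say which algorithm was used, so this is an assumption: the adjacency matrix of each k-node sub-graph is packed into an integer, one bit per ordered pair of nodes. The names `Graph`, `subgraph_code`, `extract_subgraph_codes` and `example_graph` are hypothetical, and the encoding depends on node order, so it is not a canonical form for isomorphic sub-graphs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 8 /* assumed upper bound, enough for the example */

typedef struct {
    size_t n;                        /* number of nodes (basic blocks) */
    bool adj[MAX_NODES][MAX_NODES];  /* adj[a][b]: edge a -> b */
} Graph;

/* Pack the adjacency matrix of the sub-graph induced by nodes[0..k-1]
 * into an integer, row by row, one bit per ordered node pair. */
uint32_t subgraph_code(const Graph *g, const size_t *nodes, size_t k)
{
    assert(k * k <= 32); /* the code must fit in 32 bits */
    uint32_t code = 0;
    for (size_t i = 0; i < k; i++) {
        for (size_t j = 0; j < k; j++) {
            code = (code << 1) | (g->adj[nodes[i]][nodes[j]] ? 1u : 0u);
        }
    }
    return code;
}

/* Encode every k-node sub-graph of g; returns how many codes were written. */
size_t extract_subgraph_codes(const Graph *g, size_t k,
                              uint32_t *out, size_t out_cap)
{
    assert(g->n <= MAX_NODES && k >= 1 && k <= g->n);
    size_t count = 0;
    /* Each bitmask selects a subset of nodes; keep subsets of size k. */
    for (uint32_t mask = 0; mask < (1u << g->n); mask++) {
        size_t nodes[MAX_NODES];
        size_t picked = 0;
        for (size_t v = 0; v < g->n; v++) {
            if (mask & (1u << v)) {
                nodes[picked++] = v;
            }
        }
        if (picked != k) {
            continue;
        }
        assert(count < out_cap);
        out[count++] = subgraph_code(g, nodes, k);
    }
    return count;
}

/* Diamond-shaped graph of an if/else: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3. */
Graph example_graph(void)
{
    Graph g = { .n = 4 }; /* all edges start out false */
    g.adj[0][1] = true;
    g.adj[0][2] = true;
    g.adj[1][3] = true;
    g.adj[2][3] = true;
    return g;
}
```

For the diamond graph with k = 3 there are four sub-graphs, and this encoding gives them the codes 192, 136, 136 and 72. The two sub-graphs that are simple chains (0 -> 1 -> 3 and 0 -> 2 -> 3) get the same code.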
Problems solved
- Quick identification of malware and viruses
- Prevents illegal code reuse
Credits
This presentation is inspired by:
Thank you
Yixuan, U san, thanks!
computational thinking
By vampure
Show and tell: Share an interesting application of computational thinking in real life