Software reverse engineering
with graph theory

Content

 

  • What is reverse engineering?
  • Motivations
  • Comparing binaries using graph theory
  • Credits

<< Compiling in Java

What is reverse engineering?

"Reverse engineering is the opposite of compiling"

Convert machine code that is close to binary

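#include <stdio.h>

typedef int DEPT;   /* placeholder: the slide does not define DEPT */

int main(void)
{
    printf("hello, world\n");

    /* example record type, shown only to illustrate readable source */
    struct person {
        int age, salary;
        DEPT department;
        char name[12];
        char address[6][20];
    };
    return 0;
}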




Source code

to source code that is human-readable

Motivations

 

 

  1. Can the software be trusted?
  2. Stuxnet
     • Took more than a year to be discovered
     • Took experts half a year to decipher the payload!

You can't trust code that you did not totally create yourself.

(Especially code from companies that employ people like me.)

- Ken Thompson, Turing Award winner, 1983

Comparing binaries using graph theory

Opportunities

  • Programmers are lazy
  • Upsurge in open source code usage
  • Programs have their own unique control-flow

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
    bool diet = false;

    if (diet == false)
    {
        printf("%s\n", "eat pizza");
    }
    else
    {
        printf("%s\n", "no eating!");
    }

    return 0;
}



example C program
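#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    /* block 1: entry, ends at the branch */
    printf("hello, world\n");
    bool diet = false;

    if (diet == false)
    {
        /* block 2: runs when diet is false */
        printf("%s\n", "eat pizza");
    }
    else
    {
        /* block 3: runs when diet is true */
        printf("%s\n", "no eating!");
    }

    /* block 4: both branches rejoin here */
    return 0;
}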



organizing code into basic blocks

we use control-flow as the program's signature 
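
To make the idea of a control-flow signature concrete, here is a minimal sketch that stores the control-flow graph of the if/else example as an adjacency matrix. The split into four basic blocks and the names `BLOCK_ENTRY`, `BLOCK_THEN`, `BLOCK_ELSE`, `BLOCK_EXIT`, `Cfg`, `build_example_cfg` and `out_degree` are assumptions made for illustration. They are not taken from the slides, and a real tool would recover the blocks from the binary.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed basic blocks of the if/else example (hypothetical names). */
enum { BLOCK_ENTRY, BLOCK_THEN, BLOCK_ELSE, BLOCK_EXIT, NUM_BLOCKS };

/* cfg[i][j] is true when control can pass directly from block i to block j. */
typedef bool Cfg[NUM_BLOCKS][NUM_BLOCKS];

/* Fill in the edges of the example: the entry block branches to either
 * the then-block or the else-block, and both rejoin at the exit block. */
void build_example_cfg(Cfg cfg)
{
    for (int i = 0; i < NUM_BLOCKS; i++)
        for (int j = 0; j < NUM_BLOCKS; j++)
            cfg[i][j] = false;

    cfg[BLOCK_ENTRY][BLOCK_THEN] = true;
    cfg[BLOCK_ENTRY][BLOCK_ELSE] = true;
    cfg[BLOCK_THEN][BLOCK_EXIT] = true;
    cfg[BLOCK_ELSE][BLOCK_EXIT] = true;
}

/* Number of blocks that can directly follow the given block. */
int out_degree(Cfg cfg, int block)
{
    assert(block >= 0 && block < NUM_BLOCKS);
    int n = 0;
    for (int j = 0; j < NUM_BLOCKS; j++)
        if (cfg[block][j])
            n++;
    return n;
}
```

Calling `build_example_cfg` on a `Cfg` variable yields a diamond shape: the entry block has two successors and the exit block has none. The shape depends on the branching structure rather than on variable names or string contents.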

We extract sub-graphs of size k

[Figure: sub-graph >> algorithm >> number (1214, 1286) >> algorithm >> string (XvxFGF, baNUAL). Weights shown: 1, 2, 4, 8]
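
The extraction step can be sketched as follows, under stated assumptions. The code enumerates every set of k nodes, keeps the sets whose induced sub-graph is connected (ignoring edge direction), and packs each adjacency matrix into an integer using bit weights 1, 2, 4, 8 and so on. The names `encode_subgraph`, `is_connected` and `extract_codes` and the bit ordering are hypothetical. The slides do not give the actual algorithm, so this sketch is not expected to reproduce numbers such as 1214 or 1286. The encoding also depends on node order, so a real tool would need a canonical ordering to give isomorphic sub-graphs the same code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 16

/* Pack the adjacency matrix of the sub-graph induced by nodes[0..k-1]
 * into an integer: entry (i, j) contributes the weight 1 << (i * k + j). */
uint32_t encode_subgraph(bool adj[MAX_NODES][MAX_NODES], const int *nodes, int k)
{
    assert(k >= 1 && k * k <= 32);
    uint32_t code = 0;
    for (int i = 0; i < k; i++)
        for (int j = 0; j < k; j++)
            if (adj[nodes[i]][nodes[j]])
                code |= (uint32_t)1 << (i * k + j);
    return code;
}

/* Depth-first search over the chosen nodes, treating edges as undirected. */
static bool is_connected(bool adj[MAX_NODES][MAX_NODES], const int *nodes, int k)
{
    bool seen[MAX_NODES] = { false };
    int stack[MAX_NODES];
    int top = 0, reached = 1;

    stack[top++] = 0;
    seen[0] = true;
    while (top > 0) {
        int i = stack[--top];
        for (int j = 0; j < k; j++) {
            bool linked = adj[nodes[i]][nodes[j]] || adj[nodes[j]][nodes[i]];
            if (linked && !seen[j]) {
                seen[j] = true;
                reached++;
                stack[top++] = j;
            }
        }
    }
    return reached == k;
}

/* Write the codes of all connected k-node sub-graphs of an n-node graph
 * into out (at most cap entries) and return how many were written. */
int extract_codes(bool adj[MAX_NODES][MAX_NODES], int n, int k,
                  uint32_t *out, int cap)
{
    assert(n <= MAX_NODES && k >= 1 && k <= n && k * k <= 32);
    int nodes[MAX_NODES];
    int count = 0;

    for (int i = 0; i < k; i++)
        nodes[i] = i;                       /* first combination: 0..k-1 */

    for (;;) {
        if (count < cap && is_connected(adj, nodes, k))
            out[count++] = encode_subgraph(adj, nodes, k);

        /* advance to the next combination in lexicographic order */
        int i = k - 1;
        while (i >= 0 && nodes[i] == n - k + i)
            i--;
        if (i < 0)
            break;
        nodes[i]++;
        for (int j = i + 1; j < k; j++)
            nodes[j] = nodes[j - 1] + 1;
    }
    return count;
}
```

For a four-block diamond graph (edges 0→1, 0→2, 1→3, 2→3) and k = 3, all four node triples are connected, so `extract_codes` returns four codes. Trying every combination is only practical for small graphs and small k.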

Problems solved

  • Quick identification of malware and viruses
     
  • Prevents illegal code reuse
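
One way two binaries might then be compared, as a sketch only: treat each program's sub-graph codes as a multiset and measure the overlap. The slides do not say which similarity measure is used. The Jaccard-style ratio below and the names `cmp_u32` and `signature_similarity` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for unsigned 32-bit codes. */
static int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a;
    uint32_t y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Sort both code lists in place, then return shared / (na + nb - shared):
 * 1.0 for identical multisets, 0.0 when no code is shared. */
double signature_similarity(uint32_t *a, size_t na, uint32_t *b, size_t nb)
{
    assert(a != NULL && b != NULL);
    if (na == 0 && nb == 0)
        return 1.0;

    qsort(a, na, sizeof *a, cmp_u32);
    qsort(b, nb, sizeof *b, cmp_u32);

    size_t i = 0, j = 0, shared = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;
        } else if (a[i] > b[j]) {
            j++;
        } else {            /* same code in both programs */
            shared++;
            i++;
            j++;
        }
    }
    return (double)shared / (double)(na + nb - shared);
}
```

A score near 1.0 would suggest that two binaries share most of their control-flow structure, which is the kind of signal that could flag a known malware sample or reused code. Any threshold for "similar enough" would have to be chosen empirically.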

Credits

This presentation is inspired by:

Thank you

Yixuan, U san, thanks!
