Code Similarity Check

GROUP MEMBERS:

Ali Ghulam (21078969)

SUPERVISOR:

Kufreh Sampson

Assistant Professor

Hertfordshire University 

Introduction

  • Replicating or altering someone else's code is unethical.
  • Who is the original creator of the source code?
  • Students' coding ability drops when they copy code.
  • Find similarities between programs written in different languages.

Motivation

  • MOSS (Measure of Software Similarity): a system for detecting software similarity [1]
  • The analogy between fraud and plagiarism in the context of the Fraud Triangle [1]

#include <iostream>
using namespace std;

// Find fibonacci of a number 'n'
int fib(int n) {
    if (n <= 1)
        return n;
    
    return fib(n-1) + fib(n-2);
}
 
int main() {
    int n = 9;
    cout << fib(n) << endl;
    return 0;
}

# Find the Fibonacci number at position 'n' (same algorithm in Python)
def fib(n: int) -> int:
    if n <= 1:
        return n
    
    return fib(n-1) + fib(n-2)
 

if __name__ == '__main__':
    n = 9
    print(fib(n))

Literature Review

Comparing Python Programs Using Abstract Syntax Trees [2]
  • Produces reports based on a similarity index.
  • Detects code similarity using sub-tree (partial) indexing.

Design Pattern Detection Based on the Graph Theory [3]
  • Detects design patterns using a semantic graph.
  • Detects similar patterns with high accuracy and efficiency.

Syntax Tree Fingerprinting for Source Code Similarity Detection [4]
  • Uses a hashing technique to fingerprint syntax trees.
  • Efficiently indexes the AST representation and reduces false-positive collisions.
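
The sub-tree hashing idea discussed in this review can be illustrated with a minimal Python sketch. This is not the algorithm from [2] or [4]; it is a simplified illustration built on Python's standard ast and hashlib modules. The function names subtree_fingerprints and shared_subtree_ratio are hypothetical, and the choice to ignore identifier names and constant values is an assumption made for this example.

```python
import ast
import hashlib


def subtree_fingerprints(source: str) -> set[str]:
    """Return one hash per AST subtree, ignoring identifier names and constants.

    Illustrative only: a real fingerprinting scheme would also weight
    subtrees by size and index them for fast lookup.
    """
    fingerprints: set[str] = set()

    def visit(node: ast.AST) -> str:
        # Hash the node type together with the hashes of its children, in order.
        child_hashes = [visit(child) for child in ast.iter_child_nodes(node)]
        payload = type(node).__name__ + "(" + ",".join(child_hashes) + ")"
        digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()
        fingerprints.add(digest)
        return digest

    visit(ast.parse(source))
    return fingerprints


def shared_subtree_ratio(source_a: str, source_b: str) -> float:
    """Jaccard overlap of the two fingerprint sets, between 0.0 and 1.0."""
    a = subtree_fingerprints(source_a)
    b = subtree_fingerprints(source_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

Because only node types feed the hash, a submission that merely renames variables and functions produces the same fingerprint set as the original, which is the kind of disguise a plagiarism checker needs to see through.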

Problem Statement

  • Generate similarity reports for student code submissions in different languages

Proposed Methodology

  • Disassemble (parse) the submitted code
  • Generate an abstract syntax tree (AST)
  • Compute a similarity index
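
The three steps above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the project's implementation: it handles Python submissions only (via the standard ast module), and it uses difflib.SequenceMatcher over node-type names as a stand-in similarity index. The function names node_type_sequence and similarity_index are hypothetical. Supporting other languages would need a parser per language whose output is mapped to a shared node vocabulary.

```python
import ast
import difflib


def node_type_sequence(source: str) -> list[str]:
    """Steps 1 and 2: parse the code and flatten its AST into node-type names."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def similarity_index(source_a: str, source_b: str) -> float:
    """Step 3: ratio in [0, 1] of matching node types between the two ASTs."""
    seq_a = node_type_sequence(source_a)
    seq_b = node_type_sequence(source_b)
    matcher = difflib.SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    return matcher.ratio()


# Two submissions that differ only in the names they use.
submission_a = """
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)
"""
submission_b = """
def fibonacci(k):
    if k <= 1:
        return k
    return fibonacci(k - 1) + fibonacci(k - 2)
"""

print(similarity_index(submission_a, submission_b))
```

Comparing node types rather than raw text means renamed identifiers do not lower the score; the trade-off is that short programs with a common structure can look similar by coincidence, so a reporting threshold would need tuning on real submissions.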

Objectives

  • Automate code plagiarism checks for teachers and instructors.
  • Improve students' coding skills by discouraging code duplication.
  • Contribute to research in the area of code analysis.

Gantt Chart

  • FYP-1:

Gantt Chart

  • FYP-2:

Flow Chart

References

[2] Salazar Paredes, Pedro. "Comparing Python Programs Using Abstract Syntax Trees." BS thesis, Uniandes, 2020.

[3] Bafandeh Mayvan, Bahareh, and Abbas Rasoolzadegan. "Design Pattern Detection Based on the Graph Theory." Knowledge-Based Systems, 2017.

[4] Chilowicz, Michel, Etienne Duris, and Gilles Roussel. "Syntax Tree Fingerprinting for Source Code Similarity Detection." 2009 IEEE 17th International Conference on Program Comprehension. IEEE, 2009.

Thank you for your precious time.

Any Suggestions?
