(Code Similarity Check)

GROUP MEMBERS:

Ali Ghulam (P17-6009)

Faizan Ahmad (P17-6020)

Muhammad Hafeez Ullah (P17-6144)

SUPERVISOR:

Shoaib Muhammad Khan

Assistant Professor

FAST NUCES, Peshawar Campus

slides.com/faizanf33/code-similarity-check-03

Distance Metric for Source Code
Abstract Syntax Trees

Introduction

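  • Copying or superficially altering someone else's code is unethical.

  • Students who copy code see their own coding ability decline.

  • Who is the original author of a piece of source code, and who deserves the credit?

  • Find similarities between programs written in one specific language.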

Objective

  • Automate code plagiarism checks for teachers and instructors.

  • Improve students' coding skills by discouraging code duplication.

  • Advance research in the area of code analysis.

Workflow

High-Level Code → Assembly Code → Abstract Syntax Tree → Similarity Index → Report
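
The first stage of this workflow (high-level code to a lower-level form) could be sketched as follows. This is only an illustration under assumptions: the slides do not say which toolchain produces the assembly, so the sketch uses CPython bytecode from the standard `dis` module as a stand-in for assembly, and the function name `opcode_sequence` is hypothetical.

```python
import dis


def opcode_sequence(source: str) -> list[str]:
    """Compile Python source and return the opcode names of its top-level code.

    Assumption: CPython bytecode stands in for the "assembly code" stage.
    Only the module-level code object is listed; bodies of nested functions
    live in separate code objects and are not expanded here.
    """
    code = compile(source, "<submission>", "exec")
    return [instruction.opname for instruction in dis.get_instructions(code)]


# Two submissions that differ only in variable names.
original = "a = 1\nb = a + 2\n"
renamed = "x = 1\ny = x + 2\n"

# Opcode names do not include identifiers, so renaming leaves them unchanged.
print(opcode_sequence(original) == opcode_sequence(renamed))
```

Working on the lowered form rather than the raw text means that purely cosmetic edits, such as renaming variables, do not change the sequence being compared.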

Project Prototype Demonstration

Efficiency 

  • Exact formation of AST through assembly code.
 
  • Compiled vs Interpreted 

Technical Difficulties

  • Identifying the immediate parent of each child node.

  • Translating source code into assembly language.
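
One common way to recover the immediate parent of each node is to keep a stack of the most recent node at every depth. The sketch below assumes the nodes arrive as a flat, pre-order list of `(depth, label)` pairs; that input format, the `Node` class, and `build_tree` are hypothetical and are not taken from the project itself.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    label: str
    # Excluded from repr/compare to avoid infinite recursion through parent links.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: list["Node"] = field(default_factory=list)


def build_tree(records: list[tuple[int, str]]) -> Optional[Node]:
    """Build a general tree from pre-order (depth, label) pairs (assumed format)."""
    root = None
    stack: list[Node] = []  # stack[d] is the most recent node seen at depth d
    for depth, label in records:
        node = Node(label)
        del stack[depth:]  # discard nodes at this depth or deeper
        if stack:
            # The node left on top of the stack is the immediate parent.
            node.parent = stack[-1]
            stack[-1].children.append(node)
        else:
            root = node
        stack.append(node)
    return root


records = [(0, "Module"), (1, "Assign"), (2, "Name"), (2, "Constant"), (1, "Expr")]
tree = build_tree(records)
print([child.label for child in tree.children])
```

Each node is pushed and popped at most once, so the whole tree is built in a single pass over the list.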

Goals Achieved

  • Conversion of source code to assembly code.

  • Implementation of a general tree.

  • Generation of the abstract syntax tree.

Plans For FYP-2

  • Compare abstract syntax trees.

  • Compute the similarity index.

  • Generate the report.
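
As a rough illustration of what a similarity index over two trees might look like, the sketch below scores two programs by the overlap of their node types. It rests on several assumptions: the slides do not specify the metric, the trees here come from Python's built-in `ast` module rather than from assembly, and the names `node_type_counts` and `similarity_index` are hypothetical. The measure is a multiset Jaccard index, which ignores tree shape.

```python
import ast
from collections import Counter


def node_type_counts(source: str) -> Counter:
    """Count how often each AST node type occurs in the given Python source."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def similarity_index(source_a: str, source_b: str) -> float:
    """Multiset Jaccard index over node types: 1.0 means identical counts."""
    a, b = node_type_counts(source_a), node_type_counts(source_b)
    shared = sum((a & b).values())  # per-type minimum of the two counts
    total = sum((a | b).values())   # per-type maximum of the two counts
    return shared / total if total else 1.0


original = "a = 1\nb = a + 2\n"
renamed = "x = 1\ny = x + 2\n"
different = "for i in range(3):\n    print(i)\n"

print(similarity_index(original, renamed))
print(similarity_index(original, different))
```

Because only node-type counts are compared, two programs with the same statements in a different arrangement would still score highly; a structural comparison (for example, matching subtrees) would be needed to tell them apart.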

Thank you for your precious time.

Any Suggestions?

Code Similarity Check - III

By Faizan Ahmad

Using graph theory and program disassembly to create abstract syntax trees from code. These will be used to generate similarity reports for student code submissions in different languages including Python, Java, and C++.
