Untangling Composite Commits Using Program Slicing
Muylaert et al.
Background
Recommendation
* Commits should consist of changes related to a single task, i.e. they should be atomic
* easier to revert changes
* find/recover from bugs
In Practice
* 15% of bug-fixing commits contain unrelated changes [1]
* 17-29% of commits are composite, i.e. contain multiple changes [2]
[1] Herzig et al.
[2] Tao et al.
Background
What are some problems posed by composite commits?
- makes code review difficult
- changes become difficult to integrate
- specifically as a researcher, these commits complect your analysis
Solution: use tools to "untangle" commits
This paper:
- a technique to determine whether commits contain a single task using program slicing and change distilling
- a first step to solving this problem
Trees and Diffs
Source Code
string hello = "sup";
Abstract Syntax Tree
VariableDeclaration
Type(string)
Target(hello)
NameExpr(hello)
StringLiteral("sup")
Tree Diffing:
Compute the "diff" between two trees and categorize each change as insert, update, move or delete
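A minimal sketch of the change categories, not the algorithm used by any particular tree differ. It assumes a prior matching step has already given corresponding nodes the same id in both trees; the `Node` class, its `node_id` and `parent_id` fields, and `classify_changes` are hypothetical names for illustration. The flat nesting of the example tree is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    node_id: int              # hypothetical stable id from an assumed matching step
    label: str                # node type, e.g. "StringLiteral"
    value: str                # node text, e.g. '"sup"'
    parent_id: Optional[int]  # id of the parent node, None for the root


def classify_changes(before: List[Node], after: List[Node]) -> List[Tuple[str, int]]:
    """Categorize each change as insert, update, move or delete."""
    old = {n.node_id: n for n in before}
    new = {n.node_id: n for n in after}
    changes = []
    for node_id, node in new.items():
        if node_id not in old:
            changes.append(("insert", node_id))
            continue
        prev = old[node_id]
        if prev.value != node.value:
            changes.append(("update", node_id))
        if prev.parent_id != node.parent_id:
            changes.append(("move", node_id))
    for node_id in old:
        if node_id not in new:
            changes.append(("delete", node_id))
    return changes


# string hello = "sup";  becomes  string hello = "hi";
before = [
    Node(1, "VariableDeclaration", "", None),
    Node(2, "Type", "string", 1),
    Node(3, "NameExpr", "hello", 1),
    Node(4, "StringLiteral", '"sup"', 1),
]
after = before[:3] + [Node(4, "StringLiteral", '"hi"', 1)]

print(classify_changes(before, after))  # [('update', 4)]
```

Real tree differs spend most of their effort on the matching step that this sketch takes as given.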
Approach
1. Calculate tree diff
2. Compute static slice for each change
- find original location of the node, before the commit
- now compute backward slice of this node
3. Two changes c_i and c_j are related iff:
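A rough sketch of steps 2 and 3, under stated assumptions. The dependence graph is a plain dictionary rather than TinyPDG output, and the statement names are made up. The condition that completes step 3 is not stated in this text, so the relatedness test is passed in as a parameter; the `overlap` predicate in the usage (slices share a node) is only a hypothetical stand-in and should not be read as the paper's definition.

```python
from collections import deque


def backward_slice(deps, start):
    """All nodes that `start` transitively depends on, including itself.

    `deps` maps a node to the nodes it directly depends on
    (data or control dependences).
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dep in deps.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def group_changes(changes, deps, related):
    """Group changes transitively using a caller-supplied `related` test."""
    slices = {c: backward_slice(deps, c) for c in changes}
    parent = {c: c for c in changes}  # union-find forest

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for i, a in enumerate(changes):
        for b in changes[i + 1:]:
            if related(slices[a], slices[b]):
                parent[find(a)] = find(b)

    groups = {}
    for c in changes:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical dependence graph: s3 and s4 both depend on s1, s6 depends on s5.
deps = {"s3": ["s1"], "s4": ["s1", "s2"], "s6": ["s5"]}
changes = ["s3", "s4", "s6"]


def overlap(slice_a, slice_b):
    # Assumed stand-in criterion: the two slices share at least one node.
    return bool(slice_a & slice_b)


groups = group_changes(changes, deps, overlap)
print(groups)  # [['s3', 's4'], ['s6']]
```

Under this stand-in criterion, a commit that yields more than one group would be flagged as composite.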
Research Questions
RQ 1: Does the proposed technique correctly identify composite commits?
RQ 2: Does the proposed technique correctly identify the individual tasks within a commit?
Evaluation
- 5 open source projects from an already available data set
- In the data set each commit had a label for its associated number of tasks
- Manually validated 388
- Human evaluation of 31 randomly sampled examples of tool output
RQ 1 Results
RQ 2 Results
- Inconclusive
- Not able to correctly identify the number of tasks in a composite commit
- however, could be due to difference in granularity
- but the manual validation did not account for this
Survey Results
Critique
+ Novel approach
+ Replication package
- Usage of TinyPDG and Change-Distiller
+ Good first step to solving a relevant problem
- No comparison to state of the art approaches
- Not the best written paper
- Evaluation for RQ2 weak
Questions?
Untangling Composite Commits Using Program Slicing
By Devjeet Roy