Untangling Composite Commits Using Program Slicing

Muylaert et al.

Background

Recommendation

* Commits should consist of changes related to a single task, i.e. they should be atomic

   * easier to revert changes

   * find/recover from bugs

In Practice

* 15% of bug-fixing commits contain unrelated changes [1]

* 17-29% of commits are composite, i.e. contain multiple changes [2]

[1] Herzig et al.

[2] Tao et al.

Background

What are some problems posed by composite commits?

  • makes code review difficult
  • changes become difficult to integrate
  • specifically as a researcher, these commits complect your analysis

Solution: use tools to "untangle" commits

This paper:

  • a technique to determine whether commits contain a single task using program slicing and change distilling
  • a first step to solving this problem

Trees and Diffs

Source Code:

string hello = "sup";

Abstract Syntax Tree (nodes):

  • VariableDeclaration
  • Type(string)
  • Target(hello)
  • NameExpr(hello)
  • StringLiteral("sup")

Tree Diffing:

Compute the "diff" between two trees and categorize each change as insert, update, move or delete
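To make the four change categories concrete, here is a minimal sketch in Python. It is not the Change-Distiller algorithm: it assumes every node already carries a stable `uid`, so the matching between old and new nodes (the hard part in practice) is taken as given. The `Node` class, the `uid` field, the helper names and the exact tree shape are hypothetical and exist only for this illustration.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    uid: str      # hypothetical stable identity; real tools must infer the matching
    label: str    # node type, e.g. "StringLiteral"
    value: str = ""
    children: list[Node] = field(default_factory=list)


def index(root, parent=None, table=None):
    """Map uid -> (node, parent uid) for every node in the tree."""
    table = {} if table is None else table
    table[root.uid] = (root, parent)
    for child in root.children:
        index(child, root.uid, table)
    return table


def tree_diff(old_root, new_root):
    """Classify each change between two trees as insert, update, move or delete."""
    old, new = index(old_root), index(new_root)
    changes = []
    for uid in old.keys() - new.keys():
        changes.append(("delete", uid))
    for uid in new.keys() - old.keys():
        changes.append(("insert", uid))
    for uid in old.keys() & new.keys():
        (old_node, old_parent), (new_node, new_parent) = old[uid], new[uid]
        if old_node.value != new_node.value:
            changes.append(("update", uid))   # same node, different value
        if old_parent != new_parent:
            changes.append(("move", uid))     # same node, different parent
    return sorted(changes)


# string hello = "sup";  ->  string hello = "hi";
old = Node("decl", "VariableDeclaration", children=[
    Node("type", "Type", "string"),
    Node("name", "NameExpr", "hello"),
    Node("init", "StringLiteral", "sup"),
])
new = Node("decl", "VariableDeclaration", children=[
    Node("type", "Type", "string"),
    Node("name", "NameExpr", "hello"),
    Node("init", "StringLiteral", "hi"),
])
print(tree_diff(old, new))
```

Changing only the string literal yields a single update on the `init` node; the other nodes are matched and left unreported.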

Approach

1. Calculate tree diff

2. Compute static slice S(c_i) for each change c_i
  • find original location of the node, before the commit
  • now compute backward slice of this node

3. Two changes c_i and c_j are related iff:

c_i \in S(c_j) \vee c_j \in S(c_i)
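The relatedness check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation (which builds on TinyPDG): the dependence graph is a plain dictionary mapping each statement to the statements it depends on, the statement names `s1`..`s5` are made up, and grouping related changes transitively into candidate tasks is an assumption added here for illustration.

```python
from itertools import combinations


def backward_slice(deps, criterion):
    """All nodes the criterion transitively depends on, plus the criterion itself."""
    seen, stack = {criterion}, [criterion]
    while stack:
        node = stack.pop()
        for dep in deps.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def related(deps, c_i, c_j):
    """c_i and c_j are related iff one lies in the backward slice of the other."""
    return c_i in backward_slice(deps, c_j) or c_j in backward_slice(deps, c_i)


def group_changes(deps, changes):
    """Assumption: treat relatedness transitively and return the connected groups."""
    parent = {c: c for c in changes}

    def find(c):
        # Union-find lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in combinations(changes, 2):
        if related(deps, a, b):
            parent[find(a)] = find(b)

    groups = {}
    for c in changes:
        groups.setdefault(find(c), set()).add(c)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical dependence edges: s3 depends on s2, s2 on s1, s5 on s4.
deps = {"s2": {"s1"}, "s3": {"s2"}, "s5": {"s4"}}
changes = ["s1", "s3", "s5"]
print(group_changes(deps, changes))
```

Here `s1` lies in the backward slice of `s3`, so those two changes end up in one group, while `s5` stands alone. Under this sketch, a commit that yields more than one group would be flagged as composite.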

Research Questions

RQ 1: Does the proposed technique correctly identify composite commits?

RQ 2: Does the proposed technique correctly identify the single tasks within a commit?

Evaluation

  • 5 open source projects from an already available data set
  • In the data set each commit had a label for its associated number of tasks
  • Manually validated 388
  • Human evaluation of 31 randomly sampled examples of tool output

RQ 1 Results

RQ 2 Results

  • Inconclusive
  • Not able to correctly identify the number of tasks in a composite commit
  • however, could be due to difference in granularity
  • but the manual validation did not account for this

Survey Results

Critique

+ Novel approach

+ Replication package

- Usage of TinyPDG and Change-Distiller

+ Good first step to solving a relevant problem

- No comparison to state of the art approaches

- Not the best written paper

- Evaluation for RQ2 weak

Questions?

Untangling Composite Commits Using Program Slicing

By Devjeet Roy
