DeepTC-Enhancer

Devjeet Roy

Ziyi Zhang

Maggie Ma

Venera Arnaoudova

Sebastiano Panichella

Annibale Panichella

Danielle Gonzalez

Mehdi Mirakhorli

Improving the Readability of Automatically Generated Tests

Testing in Software Development

Manually written tests are resource-intensive: developers spend 25% of their time writing tests.

Tests are an essential part of software engineering (SE).

First step towards quality control

Researchers have proposed several tools to automatically generate tests.

Automatically generated tests have been shown to be:

  • Effective at bug detection
  • Complementary to manually written tests

Problem solved?

Well, not completely...

An Example

@Test(timeout = 4000)
public void test040() throws Throwable {
    KeycloakUriBuilder keycloakUriBuilder0 = KeycloakUriBuilder.fromPath("x");
    HashMap<String, Integer> hashMap0 = new HashMap<String, Integer>();
    URI uRI0 = keycloakUriBuilder0.buildFromEncodedMap(hashMap0);
    assertEquals("x", uRI0.getRawPath());
}

No documentation

Meaningless identifier names

Automatically generated tests often have poor readability

How can we improve it?

An Example


/**
 * 1. Creates a new KeycloakUriBuilder "uri" from path
 * 2. Creates a new HashMap and uses it to create a new URI "result" using
 *    method "buildFromEncodedMap" of "uri"
 * 3. Checks if the raw path of "result" equals "x"
 */
@Test(timeout = 4000)
public void test040() throws Throwable {
    KeycloakUriBuilder keycloakUriBuilder0 = KeycloakUriBuilder.fromPath("x");
    HashMap<String, Integer> hashMap0 = new HashMap<String, Integer>();
    URI uRI0 = keycloakUriBuilder0.buildFromEncodedMap(hashMap0);
    assertEquals("x", uRI0.getRawPath());
}

Add a summary that tells us what the test is doing

An Example


/**
 * 1. Creates a new KeycloakUriBuilder "uri" from path
 * 2. Creates a new HashMap and uses it to create a new URI "result" using
 *    method "buildFromEncodedMap" of "uri"
 * 3. Checks if the raw path of "result" equals "x"
 */
@Test(timeout = 4000)
public void testEncodedPath() throws Throwable {
    KeycloakUriBuilder keycloakUriBuilder0 = KeycloakUriBuilder.fromPath("x");
    HashMap<String, Integer> hashMap0 = new HashMap<String, Integer>();
    URI uRI0 = keycloakUriBuilder0.buildFromEncodedMap(hashMap0);
    assertEquals("x", uRI0.getRawPath());
}

Generate a meaningful test case name...

An Example


/**
 * 1. Creates a new KeycloakUriBuilder "uri" from path
 * 2. Creates a new HashMap and uses it to create a new URI "result" using
 *    method "buildFromEncodedMap" of "uri"
 * 3. Checks if the raw path of "result" equals "x"
 */
@Test(timeout = 4000)
public void testEncodedPath() throws Throwable {
    KeycloakUriBuilder uri = KeycloakUriBuilder.fromPath("x");
    HashMap<String, Integer> map = new HashMap<String, Integer>();
    URI result = uri.buildFromEncodedMap(map);
    assertEquals("x", result.getRawPath());
}

... and meaningful variable names

DeepTC-Enhancer

Goal

Comprehensively improve the readability of automatically generated test cases.

Approach

1. High Level Method Summaries
2. Identifier Renaming (Method Name + Variable Names)

Existing Work

Panichella et al. proposed TestDescriber

Generates detailed test case summaries for automatically generated unit tests

TestDescriber Example

	/**
	 * OVERVIEW: The test case "test11" covers around 2.0% (low percentage) of
	 * statements in "ArrayIntList"
	 **/
	public void test11() throws Throwable {
		// The test case instantiates a "ArrayIntList" with the default
		// configuration (initial capacity is 8)
		ArrayIntList arrayIntList0 = new ArrayIntList();
		// The next method call trim to size of "arrayIntList0"
		// The execution of this method call implicitly covers the following 1
		// conditions:
		// - the condition "the size is less than data.length" is TRUE;
		arrayIntList0.trimToSize();
		// The next method call trim to size of "arrayIntList0"
		// The execution of this method call implicitly covers the following 1
		// conditions:
		// - the condition "the size is less than data.length" is TRUE;
		arrayIntList0.trimToSize();
		// Then, it tests:
		// 1) whether the size of "arrayintlist0" is equal to 0;
		assertEquals(0, arrayIntList0.size());
	}

Existing Work

Daka et al. proposed a technique to automatically generate descriptive names for automatically generated tests.

Daka et al. Name Generation Example

@Test
public void test0() {
  ShoppingCart cart0 = new ShoppingCart();
  boolean boolean0 = cart0.addPrice(2298);
  assertEquals(0, cart0.getTotal());
  assertFalse(boolean0);
}

Original name: test0
Generated name: testAddPriceReturningFalse

Method Level Summaries: Key Considerations

Aid developers in navigating automatically generated test suites


/**
 * 1. Creates a new KeycloakUriBuilder "uri" from path
 * 2. Creates a new HashMap and uses it to create a new URI "result" using
 *    method "buildFromEncodedMap" of "uri"
 * 3. Checks if the raw path of "result" equals "x"
 */
@Test(timeout = 4000)
public void testEncodedPath() throws Throwable {
    KeycloakUriBuilder uri = KeycloakUriBuilder.fromPath("x");
    HashMap<String, Integer> map = new HashMap<String, Integer>();
    URI result = uri.buildFromEncodedMap(map);
    assertEquals("x", result.getRawPath());
}

What are test case scenarios?


The series of steps that takes place in a unit test

Method Level Summaries: Key Considerations

  • Aid developers in navigating automatically generated test suites
  • Must be faster to parse through than actual source code

Test Case Scenarios: Detail Reduction + Statement Aggregation

Remove irrelevant details

  • Before: Calls method x of object y with arguments a, b and c.
  • After: Calls method x of y

Aggregate statements

  • Before: 1. Creates a new TCPConnection "connection" 2. Checks the port of "connection"
  • After: Creates a new TCPConnection and checks its port.
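The detail reduction and statement aggregation steps can be illustrated with a small Java sketch. This is only an illustration under stated assumptions, not DeepTC-Enhancer's implementation: the class `ScenarioSketch`, the `Call` type, the method names, and the sentence templates are all hypothetical, and the calls are assumed to have been extracted from the test body already.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: class, method names, and sentence templates are
// assumptions, not DeepTC-Enhancer's actual implementation.
class ScenarioSketch {

    // A method call already extracted from a test statement (assumed input).
    static final class Call {
        final String method;
        final String receiver;
        final List<String> args;

        Call(String method, String receiver, List<String> args) {
            this.method = method;
            this.receiver = receiver;
            this.args = args;
        }
    }

    // Joins items as "a, b and c".
    static String joinWithAnd(List<String> items) {
        if (items.isEmpty()) {
            return "";
        }
        if (items.size() == 1) {
            return items.get(0);
        }
        String head = String.join(", ", items.subList(0, items.size() - 1));
        return head + " and " + items.get(items.size() - 1);
    }

    // Full-detail description that names every argument.
    static String describeVerbose(Call c) {
        String base = "Calls method " + c.method + " of object " + c.receiver;
        if (c.args.isEmpty()) {
            return base + ".";
        }
        return base + " with arguments " + joinWithAnd(c.args) + ".";
    }

    // Detail reduction: keep the method and receiver, drop the arguments.
    static String describeReduced(Call c) {
        return "Calls method " + c.method + " of " + c.receiver;
    }

    // Statement aggregation: merge several step phrases into one sentence.
    static String aggregate(List<String> phrases) {
        return joinWithAnd(phrases) + ".";
    }

    public static void main(String[] args) {
        Call call = new Call("x", "y", Arrays.asList("a", "b", "c"));
        System.out.println(describeVerbose(call));
        System.out.println(describeReduced(call));
        System.out.println(aggregate(
                Arrays.asList("Creates a new TCPConnection", "checks its port")));
    }
}
```

The reduced form trades completeness for scanning speed, which matches the stated requirement that a scenario be faster to read than the source code itself.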

Identifier Renaming

Builds on code2seq, an existing deep learning technique for extreme code summarization by Alon et al.

Utilizes a path-based representation of source code

Bi-directional LSTM with attention, as proposed by Bahdanau et al.
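To make the idea of a path-based representation concrete, the sketch below computes the path between two leaves of a tiny hand-built syntax tree by walking up to their lowest common ancestor and back down. It is a simplified illustration, not code2seq's implementation: the `Node` class, the node labels, and the `^` / `v` direction markers are assumptions chosen for readability.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of a leaf-to-leaf syntax tree path.
// Node labels and the "^" (up) / "v" (down) markers are assumptions.
class AstPathSketch {

    static final class Node {
        final String label;
        final Node parent;

        Node(String label, Node parent) {
            this.label = label;
            this.parent = parent;
        }
    }

    // Returns the path from one node up to the lowest common ancestor
    // and down to the other node.
    static String path(Node from, Node to) {
        // Ancestor chain of the start node, including the node itself.
        List<Node> up = new ArrayList<>();
        for (Node n = from; n != null; n = n.parent) {
            up.add(n);
        }
        // Walk up from the end node until we reach that chain.
        List<Node> down = new ArrayList<>();
        Node current = to;
        while (current != null && !up.contains(current)) {
            down.add(current);
            current = current.parent;
        }
        if (current == null) {
            throw new IllegalArgumentException("nodes are not in the same tree");
        }
        int lca = up.indexOf(current);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i <= lca; i++) {
            if (i > 0) {
                sb.append(" ^ ");
            }
            sb.append(up.get(i).label);
        }
        for (int i = down.size() - 1; i >= 0; i--) {
            sb.append(" v ").append(down.get(i).label);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hand-built tree for: result = uri.buildFromEncodedMap(map);
        Node assign = new Node("Assign", null);
        Node target = new Node("Name", assign);
        Node call = new Node("MethodCall", assign);
        Node receiver = new Node("Name", call);
        System.out.println(path(target, receiver));
    }
}
```

A real pipeline would obtain the tree from a Java parser and enumerate paths between many leaf pairs; the hand-built tree here only keeps the example self-contained.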

code2seq example

Key Adaptations

Train on human-written unit test cases collected from open source projects

Use only engineered projects for model training

Modified to use subword embeddings instead of subtoken embeddings, reducing vocabulary size and the out-of-vocabulary (OOV) problem

Mask all method/variable names because their names in automatically generated tests are meaningless

One model each for variable name generation and test case name generation
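The masking step can be sketched as follows. This is a minimal illustration, not the preprocessing used by DeepTC-Enhancer: the class `IdentifierMasker`, the placeholder tokens `METHOD_NAME` and `VAR_n`, and the regex-based replacement are all assumptions. A real pipeline would work on the parsed syntax tree, since a regex also rewrites matching text inside string literals.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: placeholder tokens and the regex approach are
// assumptions made for illustration.
class IdentifierMasker {

    // Replaces the test method name and each variable name with a placeholder.
    static String mask(String source, String methodName, List<String> variableNames) {
        Map<String, String> placeholders = new LinkedHashMap<>();
        placeholders.put(methodName, "METHOD_NAME");
        for (String name : variableNames) {
            // Number variables in order of first appearance in the list.
            placeholders.putIfAbsent(name, "VAR_" + (placeholders.size() - 1));
        }
        String result = source;
        for (Map.Entry<String, String> entry : placeholders.entrySet()) {
            // Whole-word match so "map" does not match inside "hashMap0".
            Pattern pattern = Pattern.compile("\\b" + Pattern.quote(entry.getKey()) + "\\b");
            result = pattern.matcher(result)
                    .replaceAll(Matcher.quoteReplacement(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        String statement = "URI uRI0 = keycloakUriBuilder0.buildFromEncodedMap(hashMap0);";
        System.out.println(mask(statement, "test040",
                Arrays.asList("keycloakUriBuilder0", "hashMap0", "uRI0")));
    }
}
```

Numbered placeholders keep distinct variables distinguishable after masking, so the model can still see which uses belong to the same variable.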

Evaluation

Human evaluation with 30 external and 6 internal developers

Evaluated using common criteria (conciseness, content adequacy, etc.) on a Likert scale

Sample Question from Survey (Variable Name Evaluation)

Evaluation: DeepTC-Enhancer vs TestDescriber

DeepTC-Enhancer performs significantly better than TestDescriber in terms of readability (p=0.01) and conciseness (p=0.02)

However, there is no statistically significant difference in terms of content adequacy

/**
 * OVERVIEW: The test case "test3" covers around 6.0% (low percentage) of
 * statements in "Rational"
 **/
@Test
public void test3() throws Throwable {
    // The test case instantiates a "Rational" with numerator equal to 1L,
    // and denominator equal to 3215L.
    // The execution of this constructor implicitly covers the following 1
    // conditions:
    // - the condition " denominator equals to 0L" is FALSE;
    Rational rational0 = new Rational(1L, 3215L);
    // The test case declares an object of the class "Rational" whose value
    // is equal to the absolute value of "rational0"
    Rational rational1 = rational0.abs();
    // Then, it tests:
    // 1) whether the numerator of rational0 is equal to 1L;
    assertEquals(1L, rational0.numerator);
    // 2) whether the denominator of rational0 is equal to 3215L;
    assertEquals(3215L, rational0.denominator);
    // 3) whether the float value of "rational1" is equal to 3.11041E-4F
    // with delta equal to 0.01F;
    assertEquals(3.11041E-4F, rational1.floatValue(), 0.01F);
}

TestDescriber

/**
 * 1. Creates 2 Rational objects, "rational0" and "rational1"
 * 2. Checks the numerator and denominator of "rational0" and the
 *    float value of "rational1"
 **/
@Test
public void test3() throws Throwable {
    Rational rational0 = new Rational(1L, 3215L);
    Rational rational1 = rational0.abs();
    assertEquals(1L, rational0.numerator);
    assertEquals(3215L, rational0.denominator);
    assertEquals(3.11041E-4F, rational1.floatValue(), 0.01F);
}

DeepTC-Enhancer

Evaluation: DeepTC-Enhancer vs Daka et al.'s approach

The difference between the two approaches is not statistically significant.

  @Test
  public void test13()  throws Throwable  {
      ClassWriter classWriter0 = new ClassWriter((-18));
      classWriter0.visitAnnotation("", false);
      classWriter0.toByteArray();
      assertEquals(3, classWriter0.index);
  }

Daka et al:

testVisitAnnotationWithNonEmptyStringAndFalse

DeepTC-Enhancer:

testVisitAnnotation

Criteria compared: Readability and Intent

Evaluation: Variable Renaming

  • Respondents rated the variable names generated by DeepTC-Enhancer as capturing intent 83% of the time

Evaluation: DeepTC-Enhancer and Readability

  • 80% of external developers reported that the enhancements performed by DeepTC-Enhancer resulted in an increase in readability
    • 43% reported a significant improvement
  • All internal developers reported an increase in readability; for 4 of the 6, the increase was significant.
  • 73% of the developers indicated that they were likely to use DeepTC-Enhancer when using automatically generated tests

Evaluation: Feature Usefulness

Most useful features:

  • External Developers - Test Case Scenarios
  • Internal Developers - Variable Renaming
  • External developers appreciate the high level summaries due to their unfamiliarity with the source code.
  • Internal developers reported that the generated summaries did not fit the documentation guidelines for their project

Key Results

DeepTC-Enhancer improves the readability of automatically generated test cases

External developers find the test case summaries to be the most useful feature of our approach, while internal developers find variable renaming to be the most useful feature

Individual aspects of DeepTC-Enhancer:

  • test case summaries - significantly better
  • test case names - no significant difference
  • variable names - no baselines

Future Work

Customizable documentation style for method level summaries

Task specific and larger scale evaluation

Questions?
