How (Not) to Measure Quality in Software Development

@MichaKutz

@mkutz@mstdn.social

Janina Nemec

@IsItArtOrTrash

@IsItArtOrTrash@digitalcourage.social

Why Measure Quality?

Why do we need to talk about this?

add new feature

improve quality

Goal:

Make better informed decisions about quality

What Could Possibly Go Wrong?

Goodhart's Law

When a measure becomes a target, it ceases to be a good measure

Negative Impact on Motivation/Collaboration

How to Find Metrics?

Goal → Question → Metric

Are we going in the right direction?

Where are we?

Which Quality?

Outer Quality

Inner Quality

Process Quality

Outer Quality

How defective is the product?

Do users like the product?

How useful is the product?

Number of Found Bugs

in Staging vs Production

How defective is the product?

How effective is the test process?
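
One way to turn the staging vs production bug counts into a single number is the share of all found bugs that were only found in production. The following is a minimal sketch under that assumption; the class name, method name, and the example counts are hypothetical and not taken from the talk.

```java
// Hypothetical sketch: share of all found bugs that escaped to production.
final class BugEscapeRate {

    /** Returns the fraction of bugs found in production, between 0.0 and 1.0. */
    static double escapeRate(int foundInStaging, int foundInProduction) {
        if (foundInStaging < 0 || foundInProduction < 0) {
            throw new IllegalArgumentException("bug counts must not be negative");
        }
        int total = foundInStaging + foundInProduction;
        // No bugs found at all: report 0.0 instead of dividing by zero.
        return total == 0 ? 0.0 : (double) foundInProduction / total;
    }

    public static void main(String[] args) {
        // Made-up counts, for illustration only.
        System.out.printf("escape rate: %.2f%n", escapeRate(18, 6));
    }
}
```

A low value can mean the test process catches most bugs, but it can equally mean that production bugs are not being reported, so the number is best read together with the other metrics.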

Broken Service Level Objectives

How defective is the product?

Number of customer service complaints/contacts

Do users like the product?

How many people are complaining?

User Surveys or Platform Ratings

Do users like the product?

User Experience Tests

How useful is the product?

User Tracking

How useful is the product?

Inner Quality

How likely are unintended changes?

How maintainable is the product?

How confident is the team with the product?

Code Coverage

@Test
void strike() {
  var game = new Game();
  var firstRollPins = 10;
  var secondRollPins = 5;
  var thirdRollPins = 3;
  game.roll(firstRollPins);
  game.roll(secondRollPins);
  game.roll(thirdRollPins);
  
  var score = game.score();

  assertThat(score)
    .isEqualTo(
      firstRollPins +
      (secondRollPins + thirdRollPins) * 2);
}

public int score() {
  int score = firstRoll + secondRoll;
  if (previous != null) {
    if (previous.strike) {
      score *= 2;
    } else if (previous.spare) {
      score += firstRoll;
    }
  }
  return score;
}

@Test
void strike() {
  var game = new Game();
  var firstRollPins = 10;
  var secondRollPins = 5;
  var thirdRollPins = 3;
  game.roll(firstRollPins);
  game.roll(secondRollPins);
  game.roll(thirdRollPins);
  
  var score = game.score();

  // assertThat(score)
  //  .isEqualTo(
  //    firstRollPins +
  //    (secondRollPins + thirdRollPins) * 2);
}

How likely are unintended changes?

How much code is/isn't executed by tests?

Mutation Testing: Surviving Mutants

@Test
void strike() {
  var game = new Game();
  var firstRollPins = 10;
  var secondRollPins = 5;
  var thirdRollPins = 3;
  game.roll(firstRollPins);
  game.roll(secondRollPins);
  game.roll(thirdRollPins);
  
  var score = game.score();

  assertThat(score)
    .isEqualTo(
      firstRollPins +
      (secondRollPins + thirdRollPins) * 2);
}

public int score() {
  int score = firstRoll + secondRoll;
  if (previous != null) {
    if (previous.strike) {
      score *= 2;
    } else if (previous.spare) {
      score += firstRoll;
    }
  }
  return score;
}

AssertionFailedError:
expected: 26
 but was: 14

How likely are unintended changes?
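
The idea behind surviving mutants can be sketched by hand: change one operator in the code, rerun the tests, and see whether any test fails. The sketch below does not use a real mutation testing tool. The `ORIGINAL` and `MUTANT` functions and the two "tests" are hypothetical stand-ins for the strike rule and the `strike()` test with and without its assertion.

```java
import java.util.function.IntUnaryOperator;
import java.util.function.Predicate;

// Hand-rolled illustration of a surviving vs. a killed mutant (no real tool involved).
final class MutationSketch {

    // Rule from the score() example: after a strike the frame score is doubled.
    static final IntUnaryOperator ORIGINAL = score -> score * 2;

    // Hand-made mutant: the multiplication is replaced by an addition.
    static final IntUnaryOperator MUTANT = score -> score + 2;

    // A "test" returns true when it passes for the given implementation.
    // This one only executes the code, like a test whose assertion is commented out.
    static final Predicate<IntUnaryOperator> WITHOUT_ASSERTION = impl -> {
        impl.applyAsInt(8);
        return true;
    };

    // This one checks the result: 5 + 3 pins after a strike must count twice.
    static final Predicate<IntUnaryOperator> WITH_ASSERTION =
        impl -> impl.applyAsInt(8) == 16;

    /** A mutant survives when the test passes on the original and on the mutant. */
    static boolean survives(Predicate<IntUnaryOperator> test) {
        return test.test(ORIGINAL) && test.test(MUTANT);
    }

    public static void main(String[] args) {
        System.out.println("without assertion, mutant survives: "
            + survives(WITHOUT_ASSERTION));
        System.out.println("with assertion, mutant survives: "
            + survives(WITH_ASSERTION));
    }
}
```

Both tests execute the same code and would therefore report the same coverage, but only the one with an assertion kills the mutant.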

Team Surveys

How confident is the team with the product?

How effectively can you work with the code?

How confident are you about deploying to production?

What would you need to improve the above answers?

Static Code Analysis: Complexity

public int score() {
  return firstRoll + secondRoll;
}

public int score() {
  int score = firstRoll + secondRoll;
  if (previous != null && previous.spare) {
    score += firstRoll;
  }
  return score;
}

public int score() {
  int score = firstRoll + secondRoll;
  if (previous != null) {
    if (previous.strike) {
      score *= 2;
    } else if (previous.spare) {
      score += firstRoll;
    }
  }
  return score;
}
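
(Figure: control-flow graph of score(), with nodes for "int score = firstRoll + secondRoll;", the checks "previous != null && previous.spare", "previous.strike" and "previous.spare", the statements "score *= 2;" and "score += firstRoll;", and "return score;".)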
How maintainable is the product?
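
Cyclomatic complexity is commonly described as one plus the number of decision points in a method. The sketch below approximates that by counting branching keywords and short-circuit operators in the source text of the three `score()` variants. It is a naive, hypothetical counter: real static analysis tools parse the code properly and differ in details such as whether `&&` counts as a decision.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Naive text-based approximation of cyclomatic complexity (illustration only).
final class ComplexitySketch {

    // Branching keywords and short-circuit operators, each counted as one decision.
    private static final Pattern DECISION =
        Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\||\\?");

    // The three score() variants, flattened to single-line strings.
    static final String SIMPLE =
        "public int score() { return firstRoll + secondRoll; }";
    static final String WITH_SPARE =
        "public int score() { int score = firstRoll + secondRoll; "
        + "if (previous != null && previous.spare) { score += firstRoll; } "
        + "return score; }";
    static final String WITH_STRIKE =
        "public int score() { int score = firstRoll + secondRoll; "
        + "if (previous != null) { if (previous.strike) { score *= 2; } "
        + "else if (previous.spare) { score += firstRoll; } } return score; }";

    /** Returns 1 plus the number of decision points found in the source text. */
    static int complexity(String methodSource) {
        Matcher matcher = DECISION.matcher(methodSource);
        int decisions = 0;
        while (matcher.find()) {
            decisions++;
        }
        return 1 + decisions;
    }

    public static void main(String[] args) {
        System.out.println("simple:      " + complexity(SIMPLE));
        System.out.println("with spare:  " + complexity(WITH_SPARE));
        System.out.println("with strike: " + complexity(WITH_STRIKE));
    }
}
```

Because it works on raw text, this counter would also count keywords inside comments or string literals, which is one reason to rely on an established tool for real measurements.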

Static Code Analysis: Code Smells

# Code Smells

- long methods,
- huge classes,
- many parameters,
- code duplicates,
- methods with complexity > 7,
- …

How maintainable is the product?

Process Quality

How fast is the process?

How safe is the process?

Velocity

v = \frac{P_{estimate}}{t_{delivery} - t_{commit1}}

How fast is the process?

How good are our estimations?
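
As a worked example of the velocity formula, assume a hypothetical story estimated at 5 points that is delivered 2.5 days after its first commit (both numbers are made up for illustration):

```latex
v = \frac{P_{estimate}}{t_{delivery} - t_{commit1}}
  = \frac{5\ \text{points}}{2.5\ \text{days}}
  = 2\ \text{points per day}
```

The result depends as much on the estimate as on the actual delivery time, which is why the same number also reflects how good the estimations are.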

Delivery Lead Time

\Delta t_{delivery} = t_{delivery} - t_{commit1}
\Delta t_{lead} = \underbrace{\Delta t_{design} + \Delta t_{validate}}_{\text{hard to measure!}} + \Delta t_{delivery}

How fast is the process?
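
The delivery part of the lead time can be computed directly from two timestamps. This is a minimal sketch assuming the first commit time and the delivery time are available as `Instant` values; the class name and the example timestamps are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of Delta t_delivery = t_delivery - t_commit1 (example values are made up).
final class DeliveryLeadTime {

    /** Returns the time between the first commit and the delivery of a change. */
    static Duration deliveryLeadTime(Instant firstCommit, Instant delivery) {
        if (delivery.isBefore(firstCommit)) {
            throw new IllegalArgumentException("delivery must not be before first commit");
        }
        return Duration.between(firstCommit, delivery);
    }

    public static void main(String[] args) {
        Instant firstCommit = Instant.parse("2024-03-04T09:15:00Z");
        Instant delivery = Instant.parse("2024-03-06T14:45:00Z");
        Duration leadTime = deliveryLeadTime(firstCommit, delivery);
        System.out.println("delivery lead time in hours: " + leadTime.toMinutes() / 60.0);
    }
}
```

The design and validation parts of the full lead time are left out here because, as noted in the formula, they are hard to measure.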

Batch Size

How fast is the process?

Change Fail Rate

08:07:23 Deploy basket-service ✓
09:06:11 Migrate checkout-service DB ✓
09:56:54 Deploy checkout-service ✓
10:19:44 Deploy order-management-service ✗
10:39:27 Rollback order-management-service ✓
11:09:59 Update database cluster ✓
12:27:32 Migrate order-management-service DB ✓
13:19:22 Deploy order-management-service ✓
14:45:55 Update database-cluster ✗
15:50:49 Update database-cluster ✓
16:39:11 Deploy product-service ✓
17:44:56 Deploy customer-data-service ✓

How safe is the process?
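
The change fail rate for a log like the one above can be sketched as failed changes divided by all changes. The sketch assumes that every log entry, including rollbacks and database migrations, counts as one change; whether that is the right definition is a team decision. The `Change` type and the method names are hypothetical.

```java
import java.util.List;

// Sketch: change fail rate = failed changes / all changes (every log entry counts).
final class ChangeFailRate {

    /** One entry of the change log; succeeded is false for entries marked as failed. */
    static final class Change {
        final String description;
        final boolean succeeded;

        Change(String description, boolean succeeded) {
            this.description = description;
            this.succeeded = succeeded;
        }
    }

    static double failRate(List<Change> changes) {
        if (changes.isEmpty()) {
            return 0.0; // no changes, nothing failed
        }
        long failed = changes.stream().filter(change -> !change.succeeded).count();
        return (double) failed / changes.size();
    }

    /** The twelve entries of the example log. */
    static List<Change> exampleLog() {
        return List.of(
            new Change("Deploy basket-service", true),
            new Change("Migrate checkout-service DB", true),
            new Change("Deploy checkout-service", true),
            new Change("Deploy order-management-service", false),
            new Change("Rollback order-management-service", true),
            new Change("Update database cluster", true),
            new Change("Migrate order-management-service DB", true),
            new Change("Deploy order-management-service", true),
            new Change("Update database-cluster", false),
            new Change("Update database-cluster", true),
            new Change("Deploy product-service", true),
            new Change("Deploy customer-data-service", true));
    }

    public static void main(String[] args) {
        System.out.printf("change fail rate: %.1f%%%n", 100 * failRate(exampleLog()));
    }
}
```

With two failed entries out of twelve, this counting rule gives a rate of about 16.7 percent for the example day.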

Mean Time to Restore Service

08:07:23 Deploy basket-service ✓
09:06:11 Migrate checkout-service DB ✓
09:56:54 Deploy checkout-service ✓
10:19:44 Deploy order-management-service ✗
10:39:27 Rollback order-management-service ✓
11:09:59 Update database cluster ✓
12:27:32 Migrate order-management-service DB ✓
13:19:22 Deploy order-management-service ✓
14:45:55 Update database-cluster ✗
15:50:49 Update database-cluster ✓
16:39:11 Deploy product-service ✓
17:44:56 Deploy customer-data-service ✓

How safe is the process?
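
For the same log, the mean time to restore service can be sketched as the average time between a failed change and the next successful entry for the same component. That pairing is an assumption made for this example: the log does not say when service was actually degraded or restored. The `Incident` type and the method names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.List;

// Sketch: mean time between a failed change and the entry assumed to restore service.
final class TimeToRestore {

    /** A failure and the time service is assumed to be restored. */
    static final class Incident {
        final LocalTime failedAt;
        final LocalTime restoredAt;

        Incident(String failedAt, String restoredAt) {
            this.failedAt = LocalTime.parse(failedAt);
            this.restoredAt = LocalTime.parse(restoredAt);
        }
    }

    static Duration meanTimeToRestore(List<Incident> incidents) {
        if (incidents.isEmpty()) {
            return Duration.ZERO; // nothing failed, nothing to restore
        }
        Duration total = Duration.ZERO;
        for (Incident incident : incidents) {
            total = total.plus(Duration.between(incident.failedAt, incident.restoredAt));
        }
        return total.dividedBy(incidents.size());
    }

    /** The two failures in the example log, paired with the following successful entry. */
    static List<Incident> exampleIncidents() {
        return List.of(
            new Incident("10:19:44", "10:39:27"),  // failed deploy, then rollback
            new Incident("14:45:55", "15:50:49")); // failed update, then successful update
    }

    public static void main(String[] args) {
        Duration mean = meanTimeToRestore(exampleIncidents());
        System.out.println("mean time to restore in seconds: " + mean.toMillis() / 1000.0);
    }
}
```

Under this pairing the two incidents took 19 min 43 s and 1 h 4 min 54 s, so the mean is roughly 42 minutes.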

Delivery Lead Time

Batch Size

Change Fail Rate

Mean Time to Restore Service

Outer Quality: SLOs, Customer Service, User Tracking, UX Tests

Inner Quality: Code Complexity, Code Smells, Mutation Testing, Team Surveys

Process Quality: Change Fail Rate, Lead Time, Deployment Frequency, Time to Restore Service

Not everything that counts can be counted,
and not everything that can be counted counts.

How (Not) to Measure Quality

By Michael Kutz

Measuring quality is hard, defining what quality is is difficult, and being aware of why you measure is fundamentally important. Learn how to choose and combine metrics to create something valuable.