AI-TDD

How to keep control

Who am I?

Technical Coach / Founder @ Mirai Training

Host @ DevDojo IT

Bug producer for ~25 years

Christian Nastasi

Question time!

How many of you...

WRITE CODE USING AI?
TRUST THE CODE WRITTEN BY AN AI?
THINK THAT IF THE CODE RUNS, THEN IT'S CORRECT?
THINK THAT IF A BUG IS FOUND, THEN WHOEVER WROTE IT WILL BE IN BIG TROUBLE?

That flow where you open your editor, write some lines of code, maybe ask an AI for a snippet, and just ride the wave. It feels great: fast, productive, creative.

Vibe Coding

Code that “seems to work” often hides bugs, unhandled edge cases, or even hallucinations from the AI. It runs, but it doesn’t always do what we — or our users — actually need.

How can we truly trust the code, whether we wrote it ourselves or it came from an AI?

The million-dollar question

Tools like TDD (Test Driven Development) and BDD (Behaviour Driven Development) allow us to formalize our expectations.

We first define what should happen, then we verify that the code (even if AI-generated) behaves accordingly.

Answer: take control of your code!

In this workshop, we’ll explore how TDD and BDD can be our safety net, turning vibe coding from a risky experience into a reliable practice.

Goal of the workshop

The importance of testing (Theory)
How to TDD (Theory and Practice)
How to BDD (Theory and Practice)
Let's apply them to vibe coding (Practice)

Agenda

The importance of testing

How much does it cost to not test our code?

The cost of not testing

CISQ estimates that in 2020 US companies lost approximately $2.08 trillion ($2,080,000,000,000) due to poor code quality.

Source: https://www.it-cisq.org/the-cost-of-poor-software-quality-in-the-us-a-2020-report/

Most defects end up costing more than they would have cost to prevent.
Kent Beck, Extreme Programming Explained

Cost of a defect

Defects are expensive when they occur:

costs of fixing them
damaged relationships
lost business
wasted development time.

Cost of a defect

We're human. Mistakes, bugs, and logical errors are always just around the corner.

No matter your seniority.

Errors

If you don't test, your customers will, and the result won't be pleasant.

Example: What does a typical customer do when faced with a buggy application?

Simple: the customer goes to the competition.

Fixing a bug found in production can be potentially much more complex than one found in development.

Time and complexity of solving a bug

Lack of know-how

Whoever developed that code may no longer be working for the company

Lack of context

You may have lost memory of why certain technical choices were made

Chain reaction

Fixing this bug could generate other bugs in features that depend on the feature we're trying to fix. Fixing it would therefore require changes in multiple places.

Fixing a bug takes time, and therefore money.

Cost and time of solving a bug

The time spent fixing bugs, if avoided, allows for more work on features and code quality.

A software house cannot charge its customers for the time spent fixing errors found in production, so that time is a net loss.

Test types

From cheapest to most expensive

They allow you to check:

Syntax
Possible bugs
Standards compliance
Duplicate code
Bad smells
Type mismatches
Some logical errors
Dead code (Code that will never be executed)

Static code analysis

Example of tools:

SonarQube (free and paid editions)
Scrutinizer (paid)
JSLint
TSLint

Static code analysis

They allow you to test individual pieces of code such as classes or individual functions, mocking all the external dependencies.

Coverage can be measured

Unit test
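To make "mocking all the external dependencies" concrete, here is a minimal sketch in plain JavaScript. The names greetUser and createUserRepositoryMock are hypothetical and were invented for this illustration; a test framework such as Jest offers its own mocking helpers, which are not shown here.

```javascript
// Unit under test (hypothetical): it depends on a repository that would
// normally talk to a database.
function greetUser(userId, userRepository) {
  const user = userRepository.findById(userId);
  return user ? `Hello, ${user.name}!` : 'Hello, stranger!';
}

// Hand-written mock: returns canned data and records how it was called,
// so the test never touches a real database.
function createUserRepositoryMock(users) {
  const calls = [];
  return {
    calls,
    findById(id) {
      calls.push(id);
      return users[id] ?? null;
    },
  };
}

const repository = createUserRepositoryMock({ 42: { name: 'Alice' } });

console.log(greetUser(42, repository)); // prints "Hello, Alice!"
console.log(greetUser(7, repository));  // prints "Hello, stranger!"
console.log(repository.calls);          // prints [ 42, 7 ]
```

Because the repository is passed in as a parameter, the unit can be exercised in isolation and the test can also check how the dependency was used.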

NOTE

100% test coverage doesn't mean the code is free of defects, but it certainly helps.
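A small sketch of why full coverage is not a guarantee. The average function below is a hypothetical example: a single check on [2, 4] executes every line, so a coverage tool would report 100%, yet the empty-array case is still wrong.

```javascript
// Hypothetical function: every line is executed by one check on [2, 4].
function average(values) {
  const sum = values.reduce((total, value) => total + value, 0);
  return sum / values.length;
}

console.log(average([2, 4])); // prints 3
// Same lines, same coverage, but an untested edge case: 0 / 0
console.log(average([]));     // prints NaN
```

Coverage measures which lines ran, not which behaviours were verified.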

Example of tools

Jest
Mocha
Jasmine

Unit test

They allow you to verify that certain constraints have been respected

Architectural tests

Example 1 - Structure
Models should be inside the directory "Acme\Models\*"

Example 2 - Dependencies
Controllers can depend on services, but not vice versa

Example 3 - Contracts
Classes of type X must implement methods Y and Z
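As a rough illustration of Example 2, the sketch below checks a dependency rule against a hand-written dependency map. The map, the file paths and the function findForbiddenDependencies are all assumptions made for this example; real architectural testing tools derive the map from the source code.

```javascript
// Hypothetical dependency map: file path -> file paths it imports.
const dependencies = {
  'controllers/bookController.js': ['services/bookService.js'],
  'services/bookService.js': ['models/book.js'],
  'models/book.js': [],
};

// Returns every import that breaks the rule
// "files under fromDir must not depend on files under toDir".
function findForbiddenDependencies(deps, fromDir, toDir) {
  const violations = [];
  for (const [file, imports] of Object.entries(deps)) {
    if (!file.startsWith(`${fromDir}/`)) continue;
    for (const imported of imports) {
      if (imported.startsWith(`${toDir}/`)) {
        violations.push(`${file} -> ${imported}`);
      }
    }
  }
  return violations;
}

// Services must not depend on controllers: no violations in this map.
console.log(findForbiddenDependencies(dependencies, 'services', 'controllers')); // prints []
```

An architectural test would fail the build whenever the returned list is not empty.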

They test features

Functional tests

Generally, all communications with the outside world are mocked
(e.g. databases, API calls, cache, etc.).

NOTE

100% feature coverage doesn't mean the code is free of defects,

but it certainly helps.

They test the integration between different parts of your system or (typically) with external parts (e.g. database, cache, API, configuration)

Integration Tests

They do NOT use mocks

They test the functioning of a complete flow in Black Box mode, simulating user interaction.

End to end tests

The test environment should be as similar as possible to the production environment

An E2E flow can include both user interaction tests and interactions with APIs or services.

Smoke tests are part of E2E testing

Minimal testing of a subset of system components, equivalent to a health check

Smoke testing

Example:
The home page returns status code 200
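The status-code check above could be sketched as follows. The function name isHomePageUp and the injectable fetchImpl parameter are assumptions made for this example, and the URL is a placeholder. By default the function relies on the global fetch available in browsers and in Node.js 18 or later.

```javascript
// Resolves to true when the page answers with HTTP 200.
// fetchImpl is injectable so the check can be tried without a network.
async function isHomePageUp(url, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(url);
    return response.status === 200;
  } catch {
    // Network errors count as a failed smoke test.
    return false;
  }
}

// Usage with a stubbed fetch that always answers 200.
const fakeFetch = async () => ({ status: 200 });
isHomePageUp('https://example.com/', fakeFetch).then((up) => console.log(up)); // prints true
```

In a real pipeline the stub would be dropped and the check would run against the deployed environment.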

It can be manual or automatic

It's very time-consuming.

Checklists are often used to avoid forgetting features or edge cases.

Complete manual test

The test pyramids

Seconds

Hours

Feedback loop importance

Automate the testing process

Continuous Integration / Continuous Deployment

Delegate the task without blocking the developer

Deployment used to be done manually, and knowledge of the process often resided in the developer's head.

Standardize the deployment process

TDD

Test Driven Development

First write the test, then the code that will pass the test

TDD - Test Driven Development

Produces unit tests

Helps to focus on what is really needed

Gives a formal verification that a class works as expected
(if done properly)

The process

STEP 1 - Write the first test

import { greeting } from './greeting.js';


test('should return a greeting with the given name', () => {
  // Arrange
  const name = 'Alice';

  // Act
  const result = greeting(name);

  // Assert
  expect(result).toBe('Hello, Alice!');
});

STEP 2 - Implement the minimal code

export function greeting(name) {
  return `Hello, ${name}!`;
}

STEP 3 - Refactor

If variable names were unclear → we’d rename them.

If we had duplicated logic → we’d clean it up.

The important rule:

after every refactor, tests must stay green.

STEP 4 - Add new tests

test('should return a generic greeting if no name is provided', () => {
  // Arrange
  const name = '';

  // Act
  const result = greeting(name);

  // Assert
  expect(result).toBe('Hello, stranger!');
});

STEP 5 - Update implementation

export function greeting(name) {
  if (!name) {
    return 'Hello, stranger!';
  }
  
  return `Hello, ${name}!`;
}

STEP 6 - Refactor

Is there duplication?

Is it clean?

Can we improve naming or structure?
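One possible answer to these questions, as a sketch rather than the only valid refactoring: the "Hello, ...!" template appears twice in the Step 5 implementation, so it can be written once with a default name. DEFAULT_NAME is a constant introduced here for illustration, and the export keyword is omitted so the snippet runs as a plain script.

```javascript
// Name used when the caller provides no name (introduced for this sketch).
const DEFAULT_NAME = 'stranger';

function greeting(name) {
  // Fall back to the default when name is empty, null or undefined,
  // so the greeting template is written only once.
  return `Hello, ${name || DEFAULT_NAME}!`;
}

console.log(greeting('Alice')); // prints "Hello, Alice!"
console.log(greeting(''));      // prints "Hello, stranger!"
```

Both existing tests should still pass after this change, which is the condition for keeping a refactor.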

Recap

Write the test → see it fail (RED)
Write the minimal code to pass (GREEN)
Refactor the code → keep tests passing (REFACTOR)
Add a new test → repeat cycle

Let's train it

String Calculator Kata
Roman Numerals

String Calculator Kata

Before you start:

Try not to read ahead.
Do one task at a time. The trick is to learn to work incrementally.
Make sure you only test for correct inputs. There is no need to test for invalid inputs for this kata, unless a step explicitly asks for it.

String Calculator Kata

Some hints:

Start with the simplest test case of an empty string and move to one and two numbers

Remember to solve things as simply as possible so that you force yourself to write tests you did not think about

Remember to refactor after each passing test
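As a starting point only, the first RED to GREEN cycle suggested by the hints might look like the sketch below. It covers just the empty-string case and deliberately leaves every other case to the exercise. The expectEqual helper is a hypothetical stand-in for a Jest assertion so that the snippet runs as a plain script.

```javascript
// Deliberately minimal: just enough code to pass the empty-string case.
// The next test (a single number) will force a real implementation.
function add(numbers) {
  return 0;
}

// Hypothetical stand-in for a test framework assertion.
function expectEqual(actual, expected) {
  if (actual !== expected) {
    throw new Error(`Expected ${expected} but got ${actual}`);
  }
}

// First test: an empty string returns 0.
expectEqual(add(''), 0);
console.log('empty string case passes');
```

Returning a constant looks like cheating, but it is what pushes the next test to drive the design.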

String Calculator Kata

Step 1 - First sums

Create a function add that takes a string and returns a number:

function add(numbers: string): number

The function can take 0, 1 or 2 numbers separated by a comma (,) and returns their sum

An empty string ("") will return 0

// Example of input
""        => 0
"1"       => 1
"1,2"     => 3
"1.1,2.2" => 3.3

String Calculator Kata

Step 2 - Many numbers

Allow the add method to handle an unknown number of arguments.

String Calculator Kata

Step 3 - New line separator

Allow the add method to handle newlines as separators

// Example of input
"1\n2,3"     => 6
"175.2,\n35" => Invalid! Should return the message:
                "Number expected but '\n' found at position 6."

String Calculator Kata

Step 4 - Missing number last position

Don’t allow the input to end in a separator.

// Example of input
"1,3," => Invalid! Should return the message:
          "Number expected but EOF found."

BDD

Behaviour Driven Development

Follows the same approach as TDD (Test / Code / Refactor), but focuses on features instead of classes/functions

BDD - Behaviour Driven Development

Describes tests using a pseudo-natural language instead of code, so domain experts can understand them and contribute their knowledge

If written in the right way, the same use case can be reused to test different layers of the application
(example: E2E / API testing / Functional testing)

BDD - Behaviour Driven Development

Feature: Login
  As a new user
  I want to log in to the website 
  So that the system can remember my data

  Scenario: Successful login to the website
    Given A user brings up the login pop-up
    When A user clicks Sign-in option
    And A user enters the email "john.smit@email.com" and password "12345"
    And A user clicks Sign-in
    Then A user should be successfully logged into the site

  Scenario: Unsuccessful login to the website
    Given A user brings up the login pop-up
    When A user clicks Sign-in option
    And A user enters the email "wrong.email@email.com" and password "wrong-password"
    And A user clicks Sign-in
    Then A user should not be successfully logged into the site
    And the warning "Invalid username or password" is shown
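To show how the same scenarios can drive a test below the UI layer, here is a sketch that replays them against a hypothetical in-memory authentication service. createAuthService and its return shape are assumptions made for this example; in a real project the Given/When/Then steps would be bound to code through a BDD tool rather than written as comments.

```javascript
// Hypothetical in-memory stand-in for the website's authentication.
function createAuthService(accounts) {
  const registered = new Map(Object.entries(accounts));
  return {
    signIn(email, password) {
      if (registered.has(email) && registered.get(email) === password) {
        return { loggedIn: true, warning: null };
      }
      return { loggedIn: false, warning: 'Invalid username or password' };
    },
  };
}

// Given a site with one registered user
const auth = createAuthService({ 'john.smit@email.com': '12345' });

// Scenario: Successful login
// When the user enters a valid email and password
const success = auth.signIn('john.smit@email.com', '12345');
// Then the user is logged in
console.log(success.loggedIn); // prints true

// Scenario: Unsuccessful login
// When the user enters a wrong email and password
const failure = auth.signIn('wrong.email@email.com', 'wrong-password');
// Then the user is not logged in and a warning is shown
console.log(failure.loggedIn); // prints false
console.log(failure.warning);  // prints "Invalid username or password"
```

The scenario text stays the same whether it is checked through the browser, an API or a service like this one. Only the step implementations change.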

A library requests that we develop an application to automate its processes. Here are the main requested features.

Let's train it

The library

Let's train it

A librarian can

Filter the book catalog by Author, Title or Genre

Let's train it

A client can

Make a book reservation

Extend the loan period

Filter the book catalog by Author, Title or Genre

See the list of the loaned books

See a book's details
