AI-TDD
How to keep control
Who am I?

Technical Coach / Founder (at)Mirai Training
Host (at) DevDojo IT
Bug producer for ~25 years

Christian Nastasi

- WRITE CODE USING AI?
- TRUST THE CODE WRITTEN BY AN AI?
- THINK THAT IF THE CODE RUNS, THEN IT'S CORRECT?
- THINK THAT IF A BUG IS FOUND, THEN YOU WILL BE IN BIG TROUBLE?
Question time!
How many of you...

That flow where you open your editor, write some lines of code, maybe ask an AI for a snippet, and just ride the wave. It feels great: fast, productive, creative.
Vibe Coding
Code that “seems to work” often hides bugs, unhandled edge cases, or even hallucinations from the AI. It runs, but it doesn’t always do what we — or our users — actually need.

How can we truly trust the code, whether we wrote it ourselves or it came from an AI?
The million-dollar question

Tools like TDD (Test Driven Development) and
BDD (Behaviour Driven Development)
allow us to formalize our expectations.
We first define what should happen, then we verify that the code
— even if AI-generated —
behaves accordingly.
Answer: take control of your code!

In this workshop, we’ll explore how TDD and BDD can be our safety net, turning vibe coding from a risky experience into a reliable practice.
Goal of the workshop

- The importance of testing (Theory)
- How to TDD (Theory and Practice)
- How to BDD (Theory and Practice)
- Let's apply them to vibe coding (Practice)
Agenda
The importance of testing
How much does it cost to not test our code?

The cost of not testing
CISQ estimates that in 2020 US companies lost approximately $2,080,000,000,000 (about $2.08 trillion) due to poor code quality.

Most defects end up costing more than they would have cost to prevent.
Kent Beck, Extreme Programming Explained

Cost of a defect

Defects are expensive when they occur:
- costs of fixing them
- damaged relationships
- lost business
- wasted development time.
Cost of a defect

We're human. Mistakes, bugs, and logical fallacies are always just around the corner.
No matter your seniority.
Errors
If you don't test, your client will, and the result is definitely not pleasant.
Example: what does a typical customer do when faced with a buggy application?
Simple: the customer goes to the competition.

Fixing a bug found in production can be much more complex than fixing one found in development.
Time and complexity of solving a bug
Lack of know-how
Whoever developed that code may no longer be working for the company

Lack of context
You may have lost memory of why certain technical choices were made

Chain reaction
Fixing this bug could generate other bugs in features that depend on the feature we're trying to fix. Fixing it would therefore require changes in multiple places.

Fixing a bug takes time, and therefore money.
Cost and time of solving a bug
The time spent fixing bugs, if avoided, allows for more work on features and code quality.
A software house cannot bill its customers for the time spent correcting errors found in production, so that time is a pure loss.
Test types
From cheapest to most expensive

They allow you to check:
- Syntax
- Possible bugs
- Standards compliance
- Duplicate code
- Bad smells
- Type mismatches
- Some logical errors
- Dead code (Code that will never be executed)
Static code analysis

Example of tools:
- SonarQube (paid)
- Scrutinizer (paid)
- JSLint
- TSLint
Static code analysis

They allow you to test individual pieces of code such as classes or individual functions, mocking all the external dependencies.
Coverage can be measured
Unit test
NOTE
100% test coverage doesn't mean the code is free of defects, but it certainly helps.
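
To make "mocking all the external dependencies" concrete, here is a minimal sketch in plain JavaScript. The function convertToEuro, the rate provider and the hand-written mock are all hypothetical names invented for illustration; in a real project a test runner such as Jest would typically supply the mock and the assertions.

```javascript
// Hypothetical unit under test. It depends on an external "rate provider"
// (imagine an HTTP API), which is passed in so that a test can replace it.
function convertToEuro(amountInUsd, rateProvider) {
  const rate = rateProvider.getUsdToEurRate();
  return Math.round(amountInUsd * rate * 100) / 100;
}

// Hand-written mock: returns a fixed rate and counts how often it is called.
function createRateProviderMock(fixedRate) {
  const mock = {
    calls: 0,
    getUsdToEurRate() {
      mock.calls += 1;
      return fixedRate;
    },
  };
  return mock;
}

// Arrange
const rateProvider = createRateProviderMock(0.5);
// Act
const result = convertToEuro(10, rateProvider);
// Assert: the result is correct and the dependency was used exactly once
console.log(result === 5, rateProvider.calls === 1); // true true
```

Because the dependency is injected, the test never touches the network and always sees the same rate, which is what keeps a unit test fast and repeatable.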

Example of tools
- Jest
- Mocha
- Jasmine
Unit test

They allow you to verify that certain constraints have been respected
Architectural tests
Example 1 - Structure
Models should be inside the directory "Acme\Models\*"
Example 2 - Dependencies
Controllers can depend on services, but not vice versa
Example 3 - Contracts
Classes of type X must implement methods Y and Z
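
As a rough sketch of Example 2, the check below verifies a dependency rule over a hand-written map of modules and their imports. The file names, the map and the function findForbiddenDependencies are assumptions made up for this illustration; a real architectural testing tool would build the map by parsing the source code.

```javascript
// Hypothetical dependency map: module path -> modules it imports.
const dependencies = {
  'controllers/bookController.js': ['services/loanService.js'],
  'services/loanService.js': ['models/loan.js'],
  'models/loan.js': [],
};

// Returns every import that goes from a module under `fromDir`
// to a module under `toDir`.
function findForbiddenDependencies(deps, fromDir, toDir) {
  const violations = [];
  for (const [moduleName, imports] of Object.entries(deps)) {
    if (!moduleName.startsWith(fromDir)) continue;
    for (const imported of imports) {
      if (imported.startsWith(toDir)) {
        violations.push(`${moduleName} -> ${imported}`);
      }
    }
  }
  return violations;
}

// Rule: services must not depend on controllers.
const violations = findForbiddenDependencies(dependencies, 'services/', 'controllers/');
console.log(violations.length === 0 ? 'architecture OK' : violations);
```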

They test features
Functional tests
Generally, all communications with the outside world are mocked
(e.g. databases, API calls, cache, etc.).
NOTE
100% feature coverage doesn't mean the code is free of defects,
but it certainly helps.

They test the integration between different parts of your system or (typically) with external parts (e.g. database, cache, API, configuration*)
Integration Tests
They do NOT use mocks

They test the functioning of a complete flow in Black Box mode, simulating user interaction.
End to end tests
The test environment should be as similar as possible to the production environment
An E2E flow can include both user interaction tests and interactions with APIs or services.
Smoke tests are part of E2E testing

Minimal testing of a subset of system components, equivalent to a health check
Smoke testing
Example:
The home page returns status code 200
It can be manual or automatic
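
An automatic version of this example could look like the sketch below. The function homePageIsUp and the URL are hypothetical; the fetch function is passed in as a parameter so that the check can be tried here without a network. Against a real deployment you could pass the global fetch available in browsers and recent Node.js versions.

```javascript
// Smoke check: resolves to true when the home page answers with status 200.
async function homePageIsUp(baseUrl, fetchFn) {
  try {
    const response = await fetchFn(baseUrl + '/');
    return response.status === 200;
  } catch (error) {
    return false; // a connection error also counts as a failed smoke test
  }
}

// Stand-ins for a healthy and a broken deployment (illustration only).
const healthySite = async () => ({ status: 200 });
const brokenSite = async () => ({ status: 500 });

homePageIsUp('https://example.test', healthySite)
  .then((ok) => console.log('healthy site is up:', ok));
homePageIsUp('https://example.test', brokenSite)
  .then((ok) => console.log('broken site is up:', ok));
```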

It's very time-consuming.
Checklists are often used to avoid forgetting features or edge cases.
Complete manual test

The test pyramids
[Figure: test pyramid, with a feedback-time scale labelled "Seconds" and "Hours"]

Feedback loop importance

Automate the testing process
Continuous Integration / Continuous Deployment
Delegate the task without blocking the developer
Deployment used to be done manually, and knowledge of the process often resided in the developer's head.
Standardize the deployment process
TDD
Test Driven Development

First write the test, then the code that will pass the test
TDD - Test Driven Development
Produces unit tests
Helps to focus on what is really needed
Gives a formal verification that a class works as expected
(if done properly)
The process



STEP 1 - Write the first test
import { greeting } from './greeting.js';

test('should return a greeting with the given name', () => {
  // Arrange
  const name = 'Alice';
  // Act
  const result = greeting(name);
  // Assert
  expect(result).toBe('Hello, Alice!');
});

STEP 2 - Implement the minimal code
export function greeting(name) {
  return `Hello, ${name}!`;
}

STEP 3 - Refactor
If variable names were unclear → we’d rename them.
If we had duplicated logic → we’d clean it up.
The important rule:
after every refactor, tests must stay green.

STEP 4 - Add new tests
test('should return a generic greeting if no name is provided', () => {
  const name = ''; // Arrange
  const result = greeting(name); // Act
  expect(result).toBe('Hello, stranger!'); // Assert
});

STEP 5 - Update implementation
export function greeting(name) {
  if (!name) {
    return 'Hello, stranger!';
  }
  return `Hello, ${name}!`;
}

STEP 6 - Refactor
Is there duplication?
Is it clean?
Can we improve naming or structure?
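
One possible answer to these questions, as a sketch only: the text "Hello, " appears twice in the Step 5 implementation, so the two branches can be merged. The constant name DEFAULT_NAME is an assumption, and the export keyword is left out so the snippet runs as a standalone script.

```javascript
// Fallback used when no name is provided.
const DEFAULT_NAME = 'stranger';

// Same behaviour as the Step 5 version, with a single template.
function greeting(name) {
  return `Hello, ${name || DEFAULT_NAME}!`;
}

console.log(greeting('Alice')); // Hello, Alice!
console.log(greeting(''));      // Hello, stranger!
```

Both existing tests should stay green after this change, which is the only evidence that the refactor did not alter behaviour.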

Recap
- Write the test → see it fail (RED)
- Write the minimal code to pass (GREEN)
- Refactor the code → keep tests passing (REFACTOR)
- Add a new test → repeat cycle

Let's train it
- String Calculator Kata
- Roman Numerals

String Calculator Kata
Before you start:
- Try not to read ahead.
- Do one task at a time. The trick is to learn to work incrementally.
- Make sure you only test for correct inputs. There is no need to test for invalid inputs in this kata, unless a step explicitly asks for it.

String Calculator Kata
Some hints:
- Start with the simplest test case of an empty string and move to one and two numbers
- Remember to solve things as simply as possible so that you force yourself to write tests you did not think about
- Remember to refactor after each passing test
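
To illustrate what "as simply as possible" can mean for the very first test, here is a sketch of the first RED/GREEN step only. It is written without a test runner so it can run as a plain script; the console.log check stands in for a real assertion.

```javascript
// Minimal implementation: just enough for the only test written so far.
// Real parsing is deliberately missing; the next test will force it.
function add(numbers) {
  return 0;
}

// First test case from the hints: an empty string.
const actual = add('');
console.log(actual === 0 ? 'GREEN' : 'RED'); // GREEN
```

Returning a constant looks like cheating, but it is what pushes you to write the next test ("1" => 1) instead of guessing the whole solution up front.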

String Calculator Kata
Step 1 - First sums
Create a function add that takes a string and returns a number:
function add(numbers: string): number
The function can take 0, 1 or 2 numbers separated by a comma (,) and returns their sum.
An empty string ("") returns 0.
// Example of input
"" => 0
"1" => 1
"1,2" => 3
"1.1,2.2" => 3.3

String Calculator Kata
Step 2 - Many numbers
Allow the add method to handle an unknown number of arguments.

String Calculator Kata
Step 3 - New line separator
Allow the add method to handle newlines as separators
// Example of input
"1\n2,3" => 6
"175.2,\n35" => Invalid! Should return the message:
"Number expected but '\n' found at position 6."

String Calculator Kata
Step 4 - Missing number last position
Don’t allow the input to end in a separator.
// Example of input
"1,3," => Invalid! Should return the message:
"Number expected but EOF found."
BDD
Behaviour Driven Development

Follows the same approach as TDD (Test / Code / Refactor), but focuses on features instead of classes/functions
BDD - Behaviour Driven Development
Describes tests using a pseudo-natural language instead of code: domain experts are able to understand and share their knowledge
If written in the right way, the same use case can be reused to test different layers of the application
(example: E2E / API testing / Functional testing)

BDD - Behaviour Driven Development
Feature: Login
  As a new user
  I want to log in to the website
  So that the system can remember my data

  Scenario: Successful login to the website
    Given a user brings up the login pop-up
    When the user clicks the Sign-in option
    And the user enters the email "john.smit@email.com" and password "12345"
    And the user clicks Sign-in
    Then the user should be successfully logged into the site

  Scenario: Unsuccessful login to the website
    Given a user brings up the login pop-up
    When the user clicks the Sign-in option
    And the user enters the email "wrong.email@email.com" and password "wrong-password"
    And the user clicks Sign-in
    Then the user should not be logged into the site
    And the warning "Invalid username or password" is shown
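
The sketch below shows how the steps of such a scenario could be bound to code. Everything here is hypothetical: createSite is an in-memory stand-in for the real website, and the steps object mimics by hand what a BDD tool would do when matching Given/When/Then lines to functions.

```javascript
// Hypothetical in-memory system under test.
function createSite(registeredUsers) {
  return {
    loggedIn: false,
    warning: null,
    signIn(email, password) {
      this.loggedIn = registeredUsers[email] === password;
      this.warning = this.loggedIn ? null : 'Invalid username or password';
    },
  };
}

// Each step of the scenario becomes a small function.
const steps = {
  givenLoginPopUp: () => createSite({ 'john.smit@email.com': '12345' }),
  whenUserSignsIn: (site, email, password) => site.signIn(email, password),
  thenUserIsLoggedIn: (site) => site.loggedIn === true,
  thenWarningIsShown: (site, text) => site.warning === text,
};

// Scenario: successful login
const site = steps.givenLoginPopUp();
steps.whenUserSignsIn(site, 'john.smit@email.com', '12345');
console.log('logged in:', steps.thenUserIsLoggedIn(site));

// Scenario: unsuccessful login
const otherSite = steps.givenLoginPopUp();
steps.whenUserSignsIn(otherSite, 'wrong.email@email.com', 'wrong-password');
console.log('warning shown:', steps.thenWarningIsShown(otherSite, 'Invalid username or password'));
```

Keeping the scenario text separate from the step functions is what allows the same scenario to be pointed at a different layer (for example an API client or a browser driver) by swapping only the step implementations.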

A library requests that we develop an application to automate its processes. Here are the main requested features.
Let's train it
The library

Let's train it
A librarian can
- Register a book loan
- Register a book return
- Register a new book copy
- Filter the book catalog by Author, Title or Genre

Let's train it
A client can
- Make a book reservation
- Extend the loan period
- Filter the book catalog by Author, Title or Genre
- See the list of loaned books
- See a book's details
AI-TDD
By Nastasi Christian