That flow where you open your editor, write some lines of code, maybe ask an AI for a snippet, and just ride the wave. It feels great: fast, productive, creative.
Code that “seems to work” often hides bugs, unhandled edge cases, or even hallucinations from the AI. It runs, but it doesn’t always do what we — or our users — actually need.
Tools like TDD (Test Driven Development) and BDD (Behaviour Driven Development) allow us to formalize our expectations.
We first define what should happen, then we verify that the code, even if AI-generated, behaves accordingly.
In this workshop, we’ll explore how TDD and BDD can be our safety net, turning vibe coding from a risky experience into a reliable practice.
CISQ estimates that in 2020 US companies lost approximately
due to poor code quality
Most defects end up costing more than they would have cost to prevent.
Kent Beck, Extreme Programming Explained
Defects are expensive when they occur:
We're human. Mistakes, bugs, and logical fallacies are always just around the corner, no matter your seniority.
If you don't test, your client will, and the result is definitely not pleasant.
Example: What does a typical customer do when faced with a bugged application?
Simple: the customer goes to the competition.
Fixing a bug found in production can be potentially much more complex than one found in development.
Whoever developed that code may no longer be working for the company
You may have lost memory of why certain technical choices were made
Fixing this bug could generate other bugs in features that depend on the feature we're trying to fix. Fixing it would therefore require changes in multiple places.
Fixing a bug takes time, and therefore money.
The time spent fixing bugs, if avoided, allows for more work on features and code quality.
A software house cannot bill its customers for the time spent correcting errors found in production, so that time is a net loss.
They allow you to check:
Example of tools:
They allow you to test individual pieces of code such as classes or individual functions, mocking all the external dependencies.
Coverage can be measured
NOTE
100% test coverage doesn't mean the code is free of defects, but it certainly helps.
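To make the idea of mocking external dependencies concrete, here is a minimal sketch. The function `welcomeMessage` and the repository object are hypothetical names invented for this example; the point is only that the dependency is passed in, so the test can swap it for a hand-written fake.

```javascript
// Hypothetical unit under test: builds a welcome message for a user id.
// The repository is injected so a test can replace it with a fake.
function welcomeMessage(userId, userRepository) {
  const user = userRepository.findById(userId);
  return user ? `Welcome back, ${user.name}!` : 'Welcome, guest!';
}

// Hand-rolled fake standing in for a real database-backed repository.
// It also records every lookup so the test can inspect the interaction.
function createFakeRepository(users) {
  const calls = [];
  return {
    calls,
    findById(id) {
      calls.push(id);
      return users[id] ?? null;
    },
  };
}

const repository = createFakeRepository({ 42: { name: 'Alice' } });
const known = welcomeMessage(42, repository);
const unknown = welcomeMessage(7, repository);
```

No database is touched: the test controls exactly what the dependency returns, which keeps the unit test fast and deterministic.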
Example of tools
They allow you to verify that certain constraints have been respected
Example 1 - Structure
Models should be inside the directory "Acme\Models\*"
Example 2 - Dependencies
Controllers can depend on services, but not vice versa
Example 3 - Contracts
Classes of type X must implement methods Y and Z
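A dependency rule like the one in Example 2 can be sketched as a plain check over a dependency map. The map below is hypothetical and hand-written for illustration; a real architectural testing tool would typically extract it from the source code.

```javascript
// Hypothetical dependency map: module name -> modules it imports.
const dependencies = {
  'controllers/BookController': ['services/LoanService'],
  'services/LoanService': ['models/Book'],
  'models/Book': [],
};

// Returns every import that breaks the rule
// "modules in `fromLayer` must not depend on modules in `toLayer`".
function findForbiddenDependencies(graph, fromLayer, toLayer) {
  const violations = [];
  for (const [moduleName, imports] of Object.entries(graph)) {
    if (!moduleName.startsWith(`${fromLayer}/`)) continue;
    for (const imported of imports) {
      if (imported.startsWith(`${toLayer}/`)) {
        violations.push(`${moduleName} -> ${imported}`);
      }
    }
  }
  return violations;
}

// Services must not depend on controllers: an empty list means the rule holds.
const violations = findForbiddenDependencies(dependencies, 'services', 'controllers');
```

An architectural test would fail the build whenever `violations` is not empty.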
They test features
Generally, all communications with the outside world are mocked
(e.g. databases, API calls, cache, etc.).
NOTE
100% feature coverage doesn't mean the code is free of defects, but it certainly helps.
Test integration with different parts of your system or (typically) with external parts (e.g. Database, Cache, API, configuration*)
They do NOT use mocks
They test the functioning of a complete flow in Black Box mode, simulating user interaction.
The test environment should be as similar as possible to the production environment
An E2E flow can include both user interaction tests and interactions with APIs or services.
Smoke tests are part of E2E testing
Minimal testing of a subset of system components, equivalent to a health check
Example:
The home page returns status code 200
It can be manual or automatic
It's very time-consuming.
Checklists are often used to avoid forgetting features or edge cases.
Seconds
Hours
Automate the testing process
Delegate the task without blocking the developer
Deployment used to be done manually, and knowledge of the process often resided in the developer's head.
Standardize the deployment process
First write the test, then the code that will pass the test
Produces unit tests
Helps to focus on what is really needed
Gives a formal verification that a class works as expected
(if done properly)
import { greeting } from './greeting.js';

test('should return a greeting with the given name', () => {
  // Arrange
  const name = 'Alice';
  // Act
  const result = greeting(name);
  // Assert
  expect(result).toBe('Hello, Alice!');
});

export function greeting(name) {
  return `Hello, ${name}!`;
}
If variable names were unclear → we’d rename them.
If we had duplicated logic → we’d clean it.
The important rule: after every refactor, tests must stay green.
test('should return a generic greeting if no name is provided', () => {
  const name = ''; // Arrange
  const result = greeting(name); // Act
  expect(result).toBe('Hello, stranger!'); // Assert
});

export function greeting(name) {
  if (!name) {
    return 'Hello, stranger!';
  }
  return `Hello, ${name}!`;
}
Is there duplication?
Is it clean?
Can we improve naming or structure?
Write the test → see it fail (RED)
Write the minimal code to pass (GREEN)
Refactor the code → keep tests passing (REFACTOR)
Add a new test → repeat cycle
String Calculator Kata
Roman Numerals
Do one task at a time. The trick is to learn to work incrementally.
Make sure you only test for correct inputs. There is no need to test for invalid inputs in this kata.
Start with the simplest test case of an empty string and move to one and two numbers
Remember to solve things as simply as possible so that you force yourself to write tests you did not think about
Remember to refactor after each passing test
The method can take 0, 1 or 2 numbers separated by a comma (,) and returns their sum
Create a function add that takes a string and returns a number:
function add(number: string): number
An empty string ("") will return 0
// Example of input
"" => 0
"1" => 1
"1,2" => 3
"1.1,2.2" => 3.3
Allow the add method to handle an unknown number of arguments.
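As a reference point for the first two steps, here is one possible shape the code might reach after a few red/green/refactor cycles. It is a sketch, not the kata's official solution, and it assumes valid, comma-separated input only.

```javascript
// Sketch for the first two steps: an empty string yields 0, otherwise the
// comma-separated values are converted to numbers and summed.
function add(numbers) {
  if (numbers === '') {
    return 0;
  }
  return numbers
    .split(',')
    .map(Number)
    .reduce((sum, value) => sum + value, 0);
}
```

One thing to watch when writing the tests: in JavaScript, `add("1.1,2.2")` evaluates to `3.3000000000000003` because of floating-point arithmetic, so an exact comparison with `3.3` fails. With Jest, `toBeCloseTo` is the usual matcher for decimal results.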
Allow the add method to handle newlines as separators
// Example of input
"1\n2,3" => 6
"175.2,\n35" => Invalid! Should return the message:
"Number expected but '\n' found at position 6."
Don’t allow the input to end in a separator.
// Example of input
"1,3," => Invalid! Should return the message:
"Number expected but EOF found."
Follow the same approach of TDD (Test / Code / Refactor), but focus on features instead of classes/functions
Describes tests using a pseudo-natural language instead of code: domain experts are able to understand and share their knowledge
If written in the right way, the same use case can be reused to test different layers of the application
(example: E2E / API testing / Functional testing)
Feature: Login
  As a new user
  I want to log in to the website
  So that the system can remember my data

  Scenario: Successful Log in to the website
    Given A user brings up the login pop-up
    When A user clicks Sign-in option
    And A user enters the email "john.smit@email.com" and password "12345"
    And A user clicks Sign-in
    Then A user should be successfully logged into the site

  Scenario: Unsuccessful Log in to the website
    Given A user brings up the login pop-up
    When A user clicks Sign-in option
    And A user enters the email "wrong.email@email.com" and password "wrong-password"
    And A user clicks Sign-in
    Then A user should not be successfully logged into the site
    And the warning "Invalid username or password" is shown
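To show how such scenarios become executable, here is a minimal sketch of step definitions. It deliberately does not use a real BDD framework: the step registry, `runScenario`, and the in-memory `createLoginService` are all hypothetical helpers written for this example. Tools such as Cucumber apply the same idea, matching each step's text to a function.

```javascript
// Hypothetical system under test: an in-memory login service.
function createLoginService(accounts) {
  return {
    signIn(email, password) {
      return accounts[email] === password
        ? { loggedIn: true, warning: null }
        : { loggedIn: false, warning: 'Invalid username or password' };
    },
  };
}

// Minimal step registry: each step pairs a text pattern with a handler that
// receives the shared scenario context plus the captured values.
const steps = [
  {
    pattern: /^A user enters the email "(.+)" and password "(.+)"$/,
    run: (context, email, password) => {
      context.credentials = { email, password };
    },
  },
  {
    pattern: /^A user clicks Sign-in$/,
    run: (context) => {
      const { email, password } = context.credentials;
      context.result = context.service.signIn(email, password);
    },
  },
];

// Runs each scenario line against the first step definition that matches it.
function runScenario(lines, service) {
  const context = { service };
  for (const line of lines) {
    const text = line.replace(/^(Given|When|Then|And)\s+/, '');
    const step = steps.find(({ pattern }) => pattern.test(text));
    if (!step) throw new Error(`Undefined step: ${text}`);
    step.run(context, ...text.match(step.pattern).slice(1));
  }
  return context.result;
}

const service = createLoginService({ 'john.smit@email.com': '12345' });
const success = runScenario([
  'When A user enters the email "john.smit@email.com" and password "12345"',
  'And A user clicks Sign-in',
], service);
const failure = runScenario([
  'When A user enters the email "wrong.email@email.com" and password "wrong-password"',
  'And A user clicks Sign-in',
], service);
```

Because the steps only talk to `service`, the same scenario text could in principle be reused against an API client or a browser driver by swapping what the handlers call.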
A library requests that we develop an application to automate its processes. Here are the main requested features.
Register a book loan
Register a book return
Register a new book copy
Filter the book catalog by Author, Title or Genre
Make a book reservation
Extend the loan period
Filter the book catalog by Author, Title or Genre
See the list of the loaned books
See a book's details