Michael Kutz
Quality Engineer at REWE digital, Conference Speaker about QA & Agile, Founder of Agile QA Cologne meetup, Freelance QA Consultant
Bugs found by testers →
← Bugs reported by customers
Bug assignment with a small number of teams worked okay…
…but with more teams and more possible root causes (microservices) the bugs started piling up.
Running for >3 h!
While unit, service, system and exploratory tests covered a lot of risk…
…production is still a messy place
…so teams started to monitor services in production.
Traffic
Errors
Latency
Saturation
+ business metrics
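The four signals above can be derived from plain request records. The following is a minimal sketch, not the setup used at REWE digital: the `Request` record, the `golden_signals` function, and the worker-pool notion of saturation are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    duration_ms: float
    status: int  # HTTP status code


def golden_signals(requests, window_s, busy_workers, total_workers):
    """Summarise one observation window into the four signals.

    Assumptions: errors are HTTP 5xx responses, latency is the
    nearest-rank 95th percentile, and saturation is the share of
    busy workers in a fixed-size pool.
    """
    count = len(requests)
    errors = sum(1 for r in requests if r.status >= 500)
    durations = sorted(r.duration_ms for r in requests)
    if count:
        # Nearest-rank percentile: smallest value covering 95 % of requests.
        p95 = durations[math.ceil(0.95 * count) - 1]
        error_rate = errors / count
    else:
        p95 = None
        error_rate = 0.0
    return {
        "traffic_rps": count / window_s,
        "error_rate": error_rate,
        "latency_p95_ms": p95,
        "saturation": busy_workers / total_workers,
    }


# Ten requests observed in a 10-second window, one of them failing.
sample = [Request(100, 200)] * 8 + [Request(900, 500), Request(300, 200)]
signals = golden_signals(sample, window_s=10, busy_workers=3, total_workers=4)
```

Business metrics (for example, orders per minute) would be tracked next to these values; they depend on the product and are left out of the sketch.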
To improve maintainability as well, we are implementing a "You Build It, You Run It" policy…
So the teams monitor their services not only during office hours…
…but also after 5 pm…
…being truly responsible for that deployment on Friday
Service-specific monitoring is good, but we still want somebody to keep an eye on the system as a whole…
…who is able to detect and manage global incidents.
Knowing which of the 150 services might cause the incident…
…which other services might be affected…
…which of the 25 teams might be able to help…
…and managing the flow of information between these teams, the stakeholders, customer service, etc.
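One way to answer the "which services, which teams" questions is a service dependency map with team ownership attached. The sketch below assumes such a map exists. The service names, team names, and both helper functions are hypothetical and only illustrate the lookup.

```python
from collections import deque

# Hypothetical dependency map: service -> services it calls.
DEPENDS_ON = {
    "checkout": ["basket", "payment"],
    "basket": ["product-catalog"],
    "search": ["product-catalog"],
    "payment": [],
    "product-catalog": [],
}

# Hypothetical ownership map: service -> responsible team.
OWNER = {
    "checkout": "team-checkout",
    "basket": "team-checkout",
    "search": "team-search",
    "payment": "team-payment",
    "product-catalog": "team-catalog",
}


def affected_by(failing, depends_on):
    """Return all services that depend, directly or transitively, on `failing`."""
    # Invert the edges so we can walk from a service to its callers.
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    seen = set()
    queue = deque([failing])
    while queue:
        current = queue.popleft()
        for caller in dependents.get(current, ()):
            if caller not in seen and caller != failing:
                seen.add(caller)
                queue.append(caller)
    return seen


def teams_to_involve(failing, depends_on, owner):
    """Return the teams owning the failing service or any affected service."""
    services = {failing} | affected_by(failing, depends_on)
    return {owner[s] for s in services}


affected = affected_by("product-catalog", DEPENDS_ON)
teams = teams_to_involve("product-catalog", DEPENDS_ON, OWNER)
```

With 150 services and 25 teams, a lookup like this only narrows down who to contact; managing the incident itself remains a human task.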