Performance
Performant system
Responsive (fast)
Stable (doesn't explode)
... under a heavy workload
Why?
More and more partners ask for SLAs on response times
(or more transparency)
It improves the quality of our product
We need to scale
We need to be profitable and it reduces costs
How do we measure it?
Performance tests
Performance testing is a non-functional software testing technique that determines how the stability, speed, scalability, and responsiveness of an application hold up under a given workload.
- k6 (https://k6.io): runs the load against apps (or anything else)
- Prometheus
- Load testing dashboard: https://monitoring.prod.oina.ws
How to write a performance test?
Engineering / Performance
k6 tests are simple TypeScript scripts
https://gitlab.com/swan-io/commons/performance-tests
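import { check } from "k6";
import { Options } from "k6/options";
// env, graphql, getRandomProjectId, sendGqlRequest and urls are helpers
// that are not shown here (their imports are omitted).

export const options: Options = {
  stages: [
    { duration: "10s", target: 3 }, // ramp up to 3 virtual users over 10s
    { duration: "30s", target: 3 }, // hold 3 virtual users for 30s
    { duration: "10s", target: 0 }, // ramp down to 0 over 10s
  ],
  // Tag every metric with the namespace under test
  tags: {
    namespace: env.NAMESPACE,
  },
};

const graphqlQuery = graphql(`
  query ProjectInfoById($id: ID!) {
    projectInfoById(id: $id) {
      __typename
      id
      ...
    }
  }
`);

// Executed in a loop by each virtual user
export default function () {
  const projectId = getRandomProjectId();
  const response = sendGqlRequest(urls.unauthenticated, graphqlQuery, {
    id: projectId,
  });
  // GraphQL can return HTTP 200 with errors, so check both
  check(response, {
    "status is 200 and has no error": (r) =>
      r.status === 200 && r.body.errors === undefined,
  });
}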
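Since partners ask for SLAs on response times, the same options object could also carry pass/fail criteria. The fragment below is a minimal sketch using k6's built-in `thresholds` option with the built-in `http_req_duration` and `http_req_failed` metrics. The numeric limits (500 ms, 1%) are assumptions for illustration, not agreed SLAs, and it assumes the requests go through k6's HTTP module so that these metrics are populated.

```typescript
import { Options } from "k6/options";

export const options: Options = {
  stages: [
    { duration: "10s", target: 3 },
    { duration: "30s", target: 3 },
    { duration: "10s", target: 0 },
  ],
  thresholds: {
    // Assumed limit: 95% of requests must complete in under 500 ms
    http_req_duration: ["p(95)<500"],
    // Assumed limit: fewer than 1% of requests may fail
    http_req_failed: ["rate<0.01"],
  },
};
```

When a threshold is not met, k6 marks the run as failed and exits with a non-zero code, which makes it possible to gate a pipeline on it.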
How to run a performance test?
With Backstage
Engineering productivity squad:
https://backstage.prod.oina.ws/performance-testing
How do we prioritize work?
- high workload/volume
- high response times
- scalability issues/challenges
Focus on use cases that are currently the least performant
Engineering / Performance
https://monitoring.prod.oina.ws/d/performance-metrics
Team
Thibaut Villeneuve
Alexandre Pinon
How to solve performance issues?
It's case-by-case
There is often low-hanging fruit:
- adding missing SQL indexes
- adding missing GraphQL Dataloaders
- not fetching info in GraphQL resolvers when it's already available from the parent
- not fetching info that is never used
- parallelizing independent work
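To illustrate the Dataloader item above, here is a minimal sketch of the batching idea behind it: collect the keys, fetch them in one query, then map the results back. It is a simplified, synchronous stand-in and not the `dataloader` library itself. `FakeDb`, `User`, `Membership` and both resolver functions are hypothetical names made up for this example.

```typescript
// Hypothetical data model, for illustration only.
type User = { id: string; name: string };
type Membership = { id: string; userId: string };

// Fake database that counts how many queries it receives.
class FakeDb {
  queryCount = 0;
  private users: Record<string, User> = {
    u1: { id: "u1", name: "Ada" },
    u2: { id: "u2", name: "Linus" },
  };

  // One query per user: the N+1 pattern when called once per resolver.
  findUserById(id: string): User | undefined {
    this.queryCount += 1;
    return this.users[id];
  }

  // One query for many users, similar to `WHERE id IN (...)`.
  findUsersByIds(ids: string[]): (User | undefined)[] {
    this.queryCount += 1;
    return ids.map((id) => this.users[id]);
  }
}

const memberships: Membership[] = [
  { id: "m1", userId: "u1" },
  { id: "m2", userId: "u2" },
  { id: "m3", userId: "u1" },
];

// Naive: each membership fetches its user on its own.
function resolveUsersNaively(db: FakeDb, items: Membership[]): (User | undefined)[] {
  return items.map((m) => db.findUserById(m.userId));
}

// Batched: deduplicate the keys, fetch once, map the results back in order.
function resolveUsersBatched(db: FakeDb, items: Membership[]): (User | undefined)[] {
  const uniqueIds = Array.from(new Set(items.map((m) => m.userId)));
  const users = db.findUsersByIds(uniqueIds);
  const byId = new Map<string, User | undefined>();
  uniqueIds.forEach((id, i) => byId.set(id, users[i]));
  return items.map((m) => byId.get(m.userId));
}

const naiveDb = new FakeDb();
const naiveResult = resolveUsersNaively(naiveDb, memberships);

const batchedDb = new FakeDb();
const batchedResult = resolveUsersBatched(batchedDb, memberships);

console.log(`naive: ${naiveDb.queryCount} queries, batched: ${batchedDb.queryCount} query`);
```

With three memberships the naive version issues three queries and the batched version one. The gap grows linearly with the page size, which is why paginated queries tend to benefit the most.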
Recent wins
users query
4 virtual users (= 4 requests in parallel at all times) & 100k memberships: 100% timeouts
7 virtual users & 100k memberships: no errors
no graph :( but it's big I swear
Various improvements in A-C
release
Various improvements in A-C
release
Dataloader in paginated queries
release
#engineering-performance
Thank you
Performance (@ Swan)
By antogyn