March Treasure Hunt
or dealing with performance accidents
Agenda
- Accident
- Identify a bottleneck
- Apply a fix
- Confirm effect
Accident
16 March 2022: a spike in errors, Liveforce plarform become very slow or stop working.
Later we've got another accident, shorter but similar.
- number of active users was not changed
- number of publishes or other stuff remain the same
- MySQL slow query logs was enabled after the first accident
First checked ideas
Identify a bottleneck
16 of March
Slowest queries
SELECT t.job_id as jobId,
t.headline as jobHeadline,
MIN(t.start_date) as jobStartTime,
GROUP_CONCAT(DISTINCT t.roleId ORDER BY t.start_date ASC SEPARATOR ',') as roles
FROM (SELECT j.id as job_id, j.headline as headline, j.date as jobDate, jr.id as roleId, jr.start_date_utc as start_date
FROM job_roles as jr
JOIN jobs as j on j.id = jr.job_id
JOIN boards as b on j.board_id = b.id
LEFT JOIN (SELECT jr.id, count(jb.id) as count
FROM boards b
JOIN jobs as j on j.board_id = b.id
JOIN job_roles as jr on jr.job_id = j.id
LEFT JOIN crew_jobs as jb on jb.job_role_id = jr.id and jb.status = 'approved'
WHERE b.agency_id = 4
GROUP BY jr.id) approved on approved.id = jr.id
LEFT JOIN (SELECT j.multi_day_job_id, MIN(jr.start_date_utc) as start_date_utc
FROM boards b
JOIN jobs as j on j.board_id = b.id
JOIN job_roles jr ON jr.job_id = j.id
WHERE b.agency_id = 4
GROUP BY j.multi_day_job_id) mdj ON mdj.multi_day_job_id = j.multi_day_job_id
WHERE b.agency_id = 4
AND (jr.is_visible_when_full = 1 or jr.required_crew_count > ifnull(approved.count, 0) or EXISTS(SELECT *
FROM crew_jobs
WHERE freelancer_id = 40203
AND status = 'approved'
AND job_role_id = jr.id))
AND jr.is_visible_on_job_board = 1
AND jr.start_date_utc >= '2022-03-16 21:59:26'
AND (mdj.start_date_utc IS NULL OR mdj.start_date_utc >= NOW())) AS t
GROUP BY jobId, t.jobDate, t.headline
ORDER BY t.jobDate, jobStartTime, jobHeadline
LIMIT 1 OFFSET 0
Crew App Job Board
Errors analyse
Reproduce
I've logged in as affected account and just by scrolling on Job Board was able to partially reproduce an issue
MySQL query visualisation
15 of March: it was quick (<1s)
18 of March: became slow (~5s)
Optimized query part
GROUP BY jobId, t.jobDate, t.headline
ORDER BY t.jobDate, jobStartTime, jobHeadline
LIMIT 10 OFFSET 0
Before:
GROUP BY jobId
ORDER BY jobStartTime, jobHeadline
LIMIT 30 OFFSET 0
After:
Apply a fix
New Query visualisation (EXPLAIN)
Time to exec: <1s
Confirm effect
MySQL optimisation
By lynxeyedua
MySQL optimisation
- 7