Server Memory
Debugging Adventures
First Contact


~6:20 PM?
What requests were happening at 6:20 last night?
What was deployed? What was changed?
Investigation


🤔
6:10 - 6:40 PM
🤔
🤔
🤔
Debugging (Puma)

memory increasing forever
😰
Debugging (Unicorn)


🔪
🔪
🔪
😰
Findings
- /reports_dashboard/cycles was allocating >900 MB
- Unicorn was killing workers after 60 seconds, which is why we haven't seen memory problems before
- With this testing, Unicorn used ~5% more RSS
- MAX_MEMORY_WEB is not set on staging
- MAX_MEMORY_WEB is interpreted differently by PumaWorkerKiller (PWK) and UnicornWorkerKiller (UWK): UWK applies it per worker, PWK per cluster
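
To make the last finding concrete, here is a minimal sketch of how one MAX_MEMORY_WEB value could translate into very different memory ceilings depending on whether it is read per worker or per cluster. The helper names, the 1024 MB value, and the worker count of 4 are illustrative assumptions, not values from the incident, and the sketch ignores the master process.

```ruby
# Hypothetical helpers (not part of either gem): show what one setting
# allows in total, depending on how the worker killer interprets it.
def cluster_ceiling_mb(max_memory_mb, workers:, scope:)
  case scope
  when :per_worker  then max_memory_mb * workers # UWK-style: limit applies to each worker
  when :per_cluster then max_memory_mb           # PWK-style: limit applies to all workers together
  else raise ArgumentError, "unknown scope: #{scope.inspect}"
  end
end

# Average memory each worker may use before the limit is reached.
def per_worker_share_mb(max_memory_mb, workers:, scope:)
  cluster_ceiling_mb(max_memory_mb, workers: workers, scope: scope) / workers.to_f
end

# Assumed example values: 1024 MB setting, 4 workers.
puts cluster_ceiling_mb(1024, workers: 4, scope: :per_worker)   # total allowed, per-worker reading
puts cluster_ceiling_mb(1024, workers: 4, scope: :per_cluster)  # total allowed, per-cluster reading
puts per_worker_share_mb(1024, workers: 4, scope: :per_cluster) # share per worker, per-cluster reading
```

Under these assumed numbers, the per-cluster reading leaves each worker a quarter of what the per-worker reading would, so the same setting carried over from Unicorn to Puma would be far stricter.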
Actions
- New Configuration for MAX_CLUSTER_MEMORY
- Add Rack::Timeout for handling request timeouts
- Incremental Rollout of Puma
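
The request-timeout action can be sketched as a small Rack-style middleware. This is a stand-in built on Ruby's stdlib `Timeout`, not Rack::Timeout's actual implementation; the class name `RequestTimeoutSketch`, the 503 response, and the timings are assumptions chosen for illustration.

```ruby
require "timeout"

# Hypothetical stand-in for a request-timeout middleware.
# It aborts any request that runs longer than `seconds`, so a slow,
# memory-hungry endpoint cannot hold a worker indefinitely.
class RequestTimeoutSketch
  def initialize(app, seconds:)
    @app = app
    @seconds = seconds
  end

  def call(env)
    Timeout.timeout(@seconds) { @app.call(env) }
  rescue Timeout::Error
    # Assumed response shape; a real app may prefer to raise and log instead.
    [503, { "content-type" => "text/plain" }, ["Request timed out\n"]]
  end
end

# Usage with two toy Rack apps (lambdas responding to #call).
slow_app = ->(_env) { sleep 0.5; [200, {}, ["done"]] }
fast_app = ->(_env) { [200, {}, ["done"]] }

puts RequestTimeoutSketch.new(slow_app, seconds: 0.05).call({}).first # status for the slow app
puts RequestTimeoutSketch.new(fast_app, seconds: 0.05).call({}).first # status for the fast app
```

The point of the design is that the timeout lives in the request path rather than in the worker killer, so Puma does not depend on workers being killed to bound request time.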
Learnings
- See Something, Say Something
- Look at the Metrics & Logs
- Know & Build Debug Tools
- Test Your Hypotheses
- Pairing: Fun and Effective
💜
By Tony Ta