Server Memory

Debugging Adventures

First Contact

~6:20 PM?

What requests were happening at 6:20 last night?

What was deployed? What was changed?

Investigation

🤔

6:10 - 6:40 PM

🤔

Debugging (Puma)

memory increasing forever

😰

Debugging (Unicorn)

🔪

🔪

🔪

😰

Findings

  • /reports_dashboard/cycles was allocating >900 MB
  • Unicorn was killing workers after 60 seconds

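This is why we hadn't seen memory problems before: every killed worker gave its memory back.

UnicornWorkerKiller (UWK): limit applies per worker. PumaWorkerKiller (PWK): limit applies per cluster.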

  • With this testing, Unicorn used ~5% more RSS
  • MAX_MEMORY_WEB is not set on staging
  • MAX_MEMORY_WEB is interpreted differently for PumaWorkerKiller and UnicornWorkerKiller
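
To make the per-worker versus per-cluster difference concrete, here is a minimal Ruby sketch. It calls neither gem. The helper name `allowed_total_mb`, the 1024 MB limit and the worker count of 4 are all illustrative assumptions, not values from the investigation.

```ruby
# Hypothetical helper (not part of UnicornWorkerKiller or PumaWorkerKiller):
# how much total memory a limit permits before anything gets killed,
# depending on how the limit is scoped.
def allowed_total_mb(limit_mb, workers:, scope:)
  case scope
  when :per_worker  then limit_mb * workers # each worker may grow up to the limit
  when :per_cluster then limit_mb           # all workers share a single budget
  else raise ArgumentError, "unknown scope: #{scope.inspect}"
  end
end

limit_mb = 1024 # illustrative value for MAX_MEMORY_WEB, an assumption
workers  = 4    # illustrative worker count, an assumption

puts allowed_total_mb(limit_mb, workers: workers, scope: :per_worker)  # UWK-style reading
puts allowed_total_mb(limit_mb, workers: workers, scope: :per_cluster) # PWK-style reading
```

Under these assumed numbers, the same setting permits four times as much total memory when read per worker as when read per cluster, so moving from Unicorn to Puma without adjusting the value changes the effective budget.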

Actions

  • New Configuration for MAX_CLUSTER_MEMORY
  • Add Rack::Timeout for handling request timeouts
  • Incremental Rollout of Puma
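
The request-timeout action can be pictured as a piece of Rack-style middleware that gives up on a slow request instead of letting it run forever. The sketch below is not Rack::Timeout's implementation. The class name `RequestTimeoutSketch`, the 0.1 second budget and the 503 response are assumptions made for illustration, and it uses only Ruby's standard `Timeout` module.

```ruby
require "timeout"

# Hypothetical stand-in for a request-timeout middleware.
# It wraps any Rack-style app (an object responding to #call(env)).
class RequestTimeoutSketch
  def initialize(app, seconds:)
    @app = app
    @seconds = seconds
  end

  def call(env)
    # Abort the downstream app if it exceeds the time budget.
    Timeout.timeout(@seconds) { @app.call(env) }
  rescue Timeout::Error
    [503, { "content-type" => "text/plain" }, ["Request timed out\n"]]
  end
end

# Two toy apps: one that answers immediately, one that is too slow.
fast_app = ->(_env) { [200, {}, ["done"]] }
slow_app = ->(_env) { sleep 1; [200, {}, ["done"]] }

fast_status, = RequestTimeoutSketch.new(fast_app, seconds: 0.1).call({})
slow_status, = RequestTimeoutSketch.new(slow_app, seconds: 0.1).call({})
puts fast_status
puts slow_status
```

With Unicorn, the 60-second worker kill played this role as a side effect. A timeout in the request path keeps a similar safety net once workers are no longer being killed on a timer.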

Learnings

  • See Something, Say Something
  • Look at the Metrics & Logs
  • Know & Build Debug Tools
  • Test Your Hypotheses
  • Pairing: Fun and Effective

💜