Fashion Shows Engagement Update

Background

Fashion Shows is our pilot engagement. Through it we will define a process by which other product teams can hand over their products to SRE when feature development is discontinued.

1. Architecture Deep Dive

 

  • Production Readiness Checklist process
  • Away day
  • Running through troubleshooting process with team
  • Understanding likely failure scenarios

2. Separation of services to support

 

  • Website (blue-steel/blue-steel-api)
  • Editorial (front-row/polaroid)

3. Addressing Observability Gaps

 

  • Blue-steel unformatted logging
  • Better visibility of deployments
  • Measuring time for photo upload in Instant shows
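A minimal sketch of how the first and third gaps could be closed together: one JSON object per log line, plus a timer that records how long an upload took. This is an illustration only. The language blue-steel is written in, the event name `photo_upload` and the `show_id` field are all assumptions, not details of the real services.

```python
import json
import logging
import time
from contextlib import contextmanager


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Structured fields passed via extra={"fields": {...}} are merged in.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


@contextmanager
def timed(logger: logging.Logger, event: str, **fields):
    """Log how long the wrapped block took, in milliseconds."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed_ms = (time.monotonic() - start) * 1000
        logger.info(
            event,
            extra={"fields": {**fields, "duration_ms": round(elapsed_ms, 1)}},
        )
```

Usage would look like `with timed(log, "photo_upload", show_id="demo"): upload(photo)`, where `upload` is a placeholder for the real upload call. The duration is logged even if the upload raises, so failed uploads are still measured.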

4. SLIs/SLOs

 

  • Asking Product Managers
  • Suggested SLOs based on Availability, Latency and Photo upload time
  • Tooling for collecting better latency metrics
  • Working on visibility into backends from Fastly metrics/logs
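As a rough illustration of how SLOs on availability, latency and photo upload time can be evaluated, the sketch below turns counts of good and total events into an SLI and a remaining error budget. The targets and thresholds in the usage note are made-up numbers, not the values suggested for Fashion Shows, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float               # fraction of good events, 0.0 to 1.0
    target: float            # SLO target, e.g. 0.999
    budget_remaining: float  # fraction of error budget left (negative if overspent)


def evaluate_slo(good: int, total: int, target: float) -> SloReport:
    """Compare an SLI (good / total) against an SLO target."""
    if total == 0:
        # No traffic in the window: nothing has been spent.
        return SloReport(sli=1.0, target=target, budget_remaining=1.0)
    actual_bad = total - good
    allowed_bad = (1.0 - target) * total
    if allowed_bad > 0:
        remaining = 1.0 - actual_bad / allowed_bad
    else:
        # A 100% target leaves no budget at all.
        remaining = 1.0 if actual_bad == 0 else float("-inf")
    return SloReport(sli=good / total, target=target, budget_remaining=remaining)


def latency_sli(durations_ms, threshold_ms):
    """Return (good, total): events that completed within the threshold."""
    durations = list(durations_ms)
    good = sum(1 for d in durations if d <= threshold_ms)
    return good, len(durations)
```

The same helper covers all three SLO types: availability uses successful versus total requests, while latency and photo upload time first pass their durations through `latency_sli`, for example `evaluate_slo(*latency_sli(upload_times_ms, 5000), target=0.95)`.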

5. Going on call

 

  • Observing
  • Quiet Fashion Shows season

6. Planning Game day

 

  • "Inject latency into blue-steel" attack
  • Black hole the Fashion Shows API for one or more pods/containers

Learnings 

 

  • Our pilot engagement has been great for helping us understand what an engagement with a dev/product team looks like, and it will serve as the basis for our standardized process
     
  • Encourages us to make more re-usable Observability improvements to benefit other teams, such as:
    - Graceful shutdown logging library
    - Deployment events on CircleCI
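
One way a reusable graceful shutdown logging helper could look, sketched here under assumptions: the real library's language and interface are not described in these slides, so `make_shutdown_handler` and `install_shutdown_logging` are hypothetical names. The idea is that a service logs which signal asked it to stop, so that restarts caused by deployments are visible in the logs.

```python
import logging
import signal
import threading


def make_shutdown_handler(logger: logging.Logger, stop: threading.Event):
    """Build a signal handler that logs the signal name and sets `stop`."""

    def handler(signum, frame):
        name = signal.Signals(signum).name
        logger.warning("shutdown requested: received %s", name)
        stop.set()

    return handler


def install_shutdown_logging(logger: logging.Logger,
                             signals=(signal.SIGTERM, signal.SIGINT)):
    """Register the handler for each signal; must run in the main thread."""
    stop = threading.Event()
    handler = make_shutdown_handler(logger, stop)
    for sig in signals:
        signal.signal(sig, handler)
    return stop
```

A service would call `stop = install_shutdown_logging(log)` at startup and have its main loop exit once `stop.is_set()`, finishing in-flight work before the process ends.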