Some DevOps ideas
Tools to measure the status of a Pipeline (PL):
Pipeline Maturity Matrix: measures five key characteristics (can be extended):
Source: https://sre.google/workbook/data-processing/#pipeline-maturity-matrix
Examples of Service Level Objectives (SLOs) and Service Level Indicators (SLIs):
Source: https://cloud.google.com/solutions/building-production-ready-data-pipelines-using-dataflow-planning#defining_and_measuring_slos
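To make the SLO/SLI distinction concrete, here is a minimal sketch of a data-freshness indicator checked against an objective. The threshold (300 seconds), the target (99%), and the sample delays are made-up assumptions for illustration, not values from the notes or from the linked source.

```ruby
# Hypothetical SLO: 99% of records are processed within 300 seconds
# of their event time. Both numbers are illustrative assumptions.
FRESHNESS_THRESHOLD_SECONDS = 300
SLO_TARGET = 0.99

# SLI: fraction of records whose processing delay is within the threshold.
# An empty window is treated as fully fresh (a design choice, not a rule).
def freshness_sli(delays_seconds, threshold: FRESHNESS_THRESHOLD_SECONDS)
  return 1.0 if delays_seconds.empty?

  fresh = delays_seconds.count { |delay| delay <= threshold }
  fresh.to_f / delays_seconds.size
end

# The SLO is met when the measured SLI reaches the target.
def slo_met?(sli, target: SLO_TARGET)
  sli >= target
end

# Made-up processing delays (seconds) for one measurement window.
delays = [12, 45, 290, 310, 60, 75, 30, 20, 15, 10]
sli = freshness_sli(delays)
puts format("freshness SLI: %.2f, SLO met: %s", sli, slo_met?(sli))
```

The SLI is the measurement, the SLO is the target it is compared against; the same shape would apply to other indicators (e.g. correctness) with a different predicate.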
Data Partition: partitioning by datetime (per day?) and by source looks logical
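A rough sketch of what a day + source partition key could look like. The `partition_path` helper, the `events` root, the `date=`/`source=` layout, and the sample records are all hypothetical; the notes only say "by datetime, by source", so the ordering of the two keys is an assumption too.

```ruby
# Hypothetical helper: derives a partition path from a record's
# event time (truncated to the UTC day) and its source.
def partition_path(source:, event_time:, root: "events")
  day = event_time.getutc.strftime("%Y-%m-%d")
  File.join(root, "date=#{day}", "source=#{source}")
end

# Made-up sample records.
records = [
  { source: "billing", event_time: Time.utc(2024, 1, 15, 23, 59) },
  { source: "billing", event_time: Time.utc(2024, 1, 16, 0, 1) },
  { source: "crm",     event_time: Time.utc(2024, 1, 15, 8, 0) }
]

# Group records by the partition they would land in.
partitions = records.group_by { |record| partition_path(**record) }
partitions.each { |path, rows| puts "#{path}: #{rows.size} record(s)" }
```

Putting the date first makes it cheap to reprocess or expire a whole day; putting the source first would instead favor per-source access. Which is better depends on the dominant query pattern.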
Initial phases (can overlap):
Goals: improve PL Maturity Matrix scores, SLOs and SLIs; reduce costs (time to market, cost of running infrastructure); CI/CD; automate operations; security everywhere
Reactive Services 4 key tenets:
Divide and Conquer
Working closely with the Tech Team is a must!
Goals:
Expected Outcome:
Actions:
e.g. Questions:
Goals:
Given: a Rails Sidekiq worker that is fully configurable (via a Configuration System) at server creation, or that can be reconfigured and restarted later
Then: configure Workers to
- read from queue x, write to queue y => Event
- read from source n, write to sink m => Data
Question: where is the queue configuration defined now?
This achieves isolation of jobs/PLs: load-balance jobs (e.g. add more workers for a high-priority queue), monitor/observe them separately, etc.
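The Given/Then idea above can be sketched in plain Ruby. This is not the Sidekiq API: `WORKER_CONFIG`, `Channels`, and `ConfigurableWorker` are hypothetical names, and the in-memory channels stand in for real queues, sources, and sinks. The point is only that the worker code stays generic while routing comes from configuration.

```ruby
# Hypothetical configuration, standing in for what a Configuration System
# would supply at server creation or on reconfigure + restart.
WORKER_CONFIG = [
  { name: "event-worker-1", read_from: "queue_x",  write_to: "queue_y" },
  { name: "data-worker-1",  read_from: "source_n", write_to: "sink_m" }
].freeze

# In-memory stand-in for queues, sources, and sinks (not Sidekiq or Redis).
class Channels
  def initialize
    @items = Hash.new { |hash, key| hash[key] = [] }
  end

  def push(name, item)
    @items[name] << item
  end

  def pop(name)
    @items[name].shift
  end

  def size(name)
    @items[name].size
  end
end

# A worker that hard-codes no queue names; all routing comes from config.
class ConfigurableWorker
  attr_reader :name, :processed

  def initialize(config, channels)
    @name = config.fetch(:name)
    @read_from = config.fetch(:read_from)
    @write_to = config.fetch(:write_to)
    @channels = channels
    @processed = 0 # per-worker counter, so each job/PL can be observed separately
  end

  # Drains the input channel, tagging each item with the worker's name.
  def run
    while (item = @channels.pop(@read_from))
      @channels.push(@write_to, item.merge(handled_by: @name))
      @processed += 1
    end
  end
end

channels = Channels.new
channels.push("queue_x", { id: 1 })
channels.push("queue_x", { id: 2 })
channels.push("source_n", { id: 3 })

workers = WORKER_CONFIG.map { |config| ConfigurableWorker.new(config, channels) }
workers.each(&:run)
workers.each { |worker| puts "#{worker.name} processed #{worker.processed}" }
```

Scaling a high-priority queue would then mean adding another config entry with the same `read_from`, without touching worker code.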