Mostly as Software Engineer
Web - 3D - Middleware - Mobile - Big Data
More recently as Architect
Data - SRE - Infrastructure
Community
Apache Beam contributor
OpenTelemetry Collector contributor
Collibra
Principal Systems Architect
AI Governance
Data Catalog
Data Governance
Data Lineage
Data Notebook
Data Privacy
Data Quality & Observability
Protect
Shared multi-tenancy saves cost — but makes it harder to figure out the cost per tenant... how can we solve this?
B
Collector(s) on the VM. There can be several (e.g. one per signal).
A
Collector(s) installed as DaemonSets.
C
Cluster-wide collectors, for telemetry that is not tied to per-node workloads.
D
A collector hooked into the ingress gateway on a specific path, to capture telemetry from the browser and our edge.
21
Queuing system is an essential part of the backbone
collibra.tenant.environment_id
CSTE - Collibra Structured Telemetry Event: Events are our golden signal
{
  "event_name": "workflow:started",
  "tenant_environment_id": "...",
  "asset_id": "..."
}
Multi-tenant service? It is the developers' responsibility to add signals in code, e.g. with the Mapped Diagnostic Context (MDC):
MDC.put("tenant_environment_id", ctx.getTenantEnvironmentId());
try {
    // all logs emitted on this thread now carry tenant_environment_id
} finally {
    MDC.clear(); // always reset, so a pooled thread does not leak the ID to the next request
}
https://c4model.com/ - The C4 model is an easy to learn, developer friendly approach to software architecture diagramming (by Simon Brown)
collibra.c4.system collibra.c4.container collibra.c4.deployment
labels:
  c4.collibra.com/system: telemetry
  c4.collibra.com/container: colkyverno
collibra.c4.system: telemetry
collibra.c4.container: colkyverno
Modular monoliths: it becomes the responsibility of the devs
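The label-to-attribute mapping above could be sketched roughly as follows. This is only an illustration: the class name `C4LabelMapper` and the rule "strip the `c4.collibra.com/` prefix, prepend `collibra.c4.`" are assumptions inferred from the two examples on the slide, not a description of how the Collibra pipeline actually does it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class C4LabelMapper {
    // Assumed conventions, inferred from the slide: labels live under
    // "c4.collibra.com/" and telemetry attributes under "collibra.c4.".
    static final String LABEL_PREFIX = "c4.collibra.com/";
    static final String ATTRIBUTE_PREFIX = "collibra.c4.";

    /** Copies every C4 label into a telemetry attribute; all other labels are ignored. */
    static Map<String, String> toAttributes(Map<String, String> labels) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, String> label : labels.entrySet()) {
            if (label.getKey().startsWith(LABEL_PREFIX)) {
                String name = label.getKey().substring(LABEL_PREFIX.length());
                attributes.put(ATTRIBUTE_PREFIX + name, label.getValue());
            }
        }
        return attributes;
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("c4.collibra.com/system", "telemetry");
        labels.put("c4.collibra.com/container", "colkyverno");
        labels.put("app.kubernetes.io/name", "colkyverno"); // not a C4 label, skipped
        // prints {collibra.c4.system=telemetry, collibra.c4.container=colkyverno}
        System.out.println(toAttributes(labels));
    }
}
```

Deriving the attributes from workload labels keeps the C4 dimensions out of application code for separately deployed containers; inside a modular monolith there is no per-module label to read, so the code has to set them.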
8
Can be sourced from different systems to merge into the data
7
Parallel pipelines handle processing, enrichment, filtering, calculation and backup to our backends
Devs don't need to know contract terms or support levels — they just log the tenant environment ID, and the backbone dynamically infers and injects the rest.
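A minimal sketch of that enrichment step, assuming the backbone holds a lookup table keyed by tenant environment ID. The class `TenantEnricher`, the registry shape, and the attribute names `collibra.tenant.contract_tier` and `collibra.tenant.support_level` are hypothetical; only `collibra.tenant.environment_id` comes from the slides.

```java
import java.util.HashMap;
import java.util.Map;

final class TenantEnricher {
    static final String TENANT_KEY = "collibra.tenant.environment_id";

    // Hypothetical lookup table: environment ID -> attributes owned by the backbone.
    private final Map<String, Map<String, String>> tenantAttributes;

    TenantEnricher(Map<String, Map<String, String>> tenantAttributes) {
        this.tenantAttributes = new HashMap<>(tenantAttributes);
    }

    /** Returns a copy of the signal attributes, with tenant metadata added when the tenant is known. */
    Map<String, String> enrich(Map<String, String> signalAttributes) {
        Map<String, String> enriched = new HashMap<>(signalAttributes);
        String environmentId = signalAttributes.get(TENANT_KEY);
        Map<String, String> extra =
                environmentId == null ? null : tenantAttributes.get(environmentId);
        if (extra != null) {
            // Values already on the signal win, so enrichment never overwrites emitted data.
            extra.forEach(enriched::putIfAbsent);
        }
        return enriched;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> registry = new HashMap<>();
        registry.put("env-123", Map.of(
                "collibra.tenant.contract_tier", "enterprise",  // hypothetical attribute
                "collibra.tenant.support_level", "premium"));   // hypothetical attribute
        TenantEnricher enricher = new TenantEnricher(registry);
        Map<String, String> signal =
                Map.of(TENANT_KEY, "env-123", "event_name", "workflow:started");
        System.out.println(enricher.enrich(signal));
    }
}
```

Signals without a tenant ID, or with an unknown one, pass through unchanged, so a missing registry entry degrades reporting rather than dropping telemetry.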
"We don't measure URLs. We measure contracts."
3
Backup of the raw data, on cheap storage.
5
We import into our data lake in batch as it's cost efficient.
9
Our data lake is where all the calculations are done for reporting, including cost attribution.
Retention is infinite.
Open problem: defensible "virtual dollar" formula for cross-team chargebacks.
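For orientation only, the simplest possible baseline is a proportional split of a shared bill by per-tenant usage. This is not the formula from the talk (which is stated as an open problem), and the usage measure (here, an event count per tenant), the class name and the tenant names are all assumptions for the sake of the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ProportionalCostSplit {
    /** Splits sharedCost across tenants in proportion to their usage (e.g. event counts). */
    static Map<String, Double> allocate(double sharedCost, Map<String, Long> usageByTenant) {
        long totalUsage = 0;
        for (long usage : usageByTenant.values()) {
            totalUsage += usage;
        }
        Map<String, Double> costByTenant = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : usageByTenant.entrySet()) {
            // With no recorded usage at all, nobody is charged anything.
            double share = totalUsage == 0 ? 0.0 : (double) entry.getValue() / totalUsage;
            costByTenant.put(entry.getKey(), sharedCost * share);
        }
        return costByTenant;
    }

    public static void main(String[] args) {
        Map<String, Long> usage = new LinkedHashMap<>();
        usage.put("tenant-a", 750L); // hypothetical tenants and counts
        usage.put("tenant-b", 250L);
        // prints {tenant-a=750.0, tenant-b=250.0}
        System.out.println(allocate(1000.0, usage));
    }
}
```

A split like this is easy to explain but ignores idle capacity and fixed costs, which is part of why a defensible chargeback formula remains open.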
① Golden attributes on day one
Define tenancy and architecture dimensions before you split into microservices, not after.
② Decouple with a backbone
Buffer-first ingestion (Pub/Sub) + centralized enrichment unlocks both ops and FinOps / BI.
③ Invest in semantic contracts
They structure your signals today and become the foundation for AI diagnostic agents tomorrow.