
Global Tenant Management using Argo

Alex Van Boxel
Principal Systems Architect, Collibra
- Almost 30 years in the sector
- Mostly as Software Engineer: Web - 3D - Middleware - Mobile - Big Data
- More recently as Architect: Data - SRE - Infrastructure
- Community: Apache Beam contributor, OpenTelemetry Collector contributor
ArgoCon 2024
Salt Lake City

[Diagram: tenant environments; operations teams]

[Diagram: tenant environments; argo-cd resources; argo-cd server; operations teams]

[Diagram: multi-1, multi-2; tenant environments; argo-cd resources; argo-cd server; engineering teams; operations teams]

Design Goal
- Development/production parity (12-factor X): the same procedures and API in both.
- Introduce tenant environment lifecycle management (12-factor XII) for creating, updating, and decommissioning tenants.
- Use off-the-shelf software to provide a UI for troubleshooting.

[Diagram: multi-1, multi-2; tenant environments; argo-cd resources; argo-cd server]

API
first
[Diagram: tenant manager API; backstage, CI/CD, CRM; lookup; multi-1, multi-2; tenant environments; argo-cd resources; argo-cd server]





Argo WF + E (Workflows + Events)
modernising
[Diagram: E, WF, lookup; multi-1, multi-2; tenant environments; argo-cd resources; argo-cd server; tenant manager API; backstage, CI/CD, CRM; pub/sub]





Hooking in Multi Tenant
making the lifecycle available
[Diagram: E, WF, lookup; multi-1, multi-2; lfcyc-1, lfcyc-2; tenant environments; argo-cd resources; argo-cd server; tenant manager API; pub/sub]


No custom CRD
just reusing the Kubernetes Job resource spec
[Diagram: E, WF, lookup; multi-1, multi-2, multi-3; lifecycle-1, lifecycle-2, lifecycle-3]


No custom CRD
custom annotations and suspend
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
  namespace: my-namespace
  annotations:
    argocd.argoproj.io/sync-options: Force=true,Replace=true
    collibra.com/tenant.lifecycle: "lifecycle.upsert,lifecycle.migrate-in"
spec:
  suspend: true # keeps the Job suspended, so Kubernetes creates no pods for it
  template:
    spec:
      containers:
        - name: my-container
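The collibra.com/tenant.lifecycle annotation on the Job carries a comma-separated list of operation names. A minimal sketch of how such a value could be interpreted, written in Python for brevity (the talk's own tool is a Go program). The function names are hypothetical, and ignoring whitespace around entries is an assumption, not something the slides state.

```python
LIFECYCLE_ANNOTATION = "collibra.com/tenant.lifecycle"  # annotation key shown on the slide


def lifecycle_operations(job: dict) -> set:
    """Return the operation names a Job manifest opts into (empty set if none)."""
    annotations = job.get("metadata", {}).get("annotations") or {}
    raw = annotations.get(LIFECYCLE_ANNOTATION, "")
    # Assumption: entries are comma-separated and surrounding whitespace is ignored.
    return {op.strip() for op in raw.split(",") if op.strip()}


def handles(job: dict, operation: str) -> bool:
    """True when the Job participates in the given lifecycle operation."""
    return operation in lifecycle_operations(job)


# The Job from the slide, reduced to the fields this sketch reads.
job = {
    "metadata": {
        "name": "my-job",
        "namespace": "my-namespace",
        "annotations": {
            LIFECYCLE_ANNOTATION: "lifecycle.upsert,lifecycle.migrate-in",
        },
    },
    "spec": {"suspend": True},
}

print(sorted(lifecycle_operations(job)))
```

A Job without the annotation yields an empty set, so teams that have not opted in are simply skipped.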


No custom CRD
gives teams control, reuse resources


Discovery
auto discovery, teams can introduce it anytime
[
  {
    "name": "lifecycle-1",
    "namespace": "namespace1"
  },
  {
    "name": "lifecycle-2",
    "namespace": "namespace2"
  }
]
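The discovery output above is a JSON array of Job names and namespaces. A rough sketch of a discovery step that produces it, assuming the Job manifests have already been read from the cluster into plain dictionaries (the Kubernetes API calls are left out). The function name, the sample data, and the sorted ordering are assumptions made for illustration.

```python
import json

LIFECYCLE_ANNOTATION = "collibra.com/tenant.lifecycle"  # annotation key shown in the deck


def discover(jobs: list, operation: str) -> str:
    """Return a JSON array of {name, namespace} for Jobs that opt into `operation`."""
    found = []
    for job in jobs:
        meta = job.get("metadata", {})
        raw = (meta.get("annotations") or {}).get(LIFECYCLE_ANNOTATION, "")
        operations = [op.strip() for op in raw.split(",") if op.strip()]
        if operation in operations:
            found.append({"name": meta["name"], "namespace": meta["namespace"]})
    # Sorting gives a stable result; the ordering is an assumption, not from the talk.
    found.sort(key=lambda item: (item["namespace"], item["name"]))
    return json.dumps(found)


# Hypothetical in-memory stand-in for the Jobs found in the cluster.
cluster_jobs = [
    {"metadata": {"name": "lifecycle-1", "namespace": "namespace1",
                  "annotations": {LIFECYCLE_ANNOTATION: "lifecycle.upsert"}}},
    {"metadata": {"name": "lifecycle-2", "namespace": "namespace2",
                  "annotations": {LIFECYCLE_ANNOTATION: "lifecycle.upsert,lifecycle.migrate-in"}}},
    {"metadata": {"name": "unrelated", "namespace": "namespace3"}},
]

print(discover(cluster_jobs, "lifecycle.upsert"))
```

Because the result is a JSON list, a workflow can fan out over it with one branch per entry.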


Discovery
expands the Argo workflow
withParam: "{{`{{steps.discovery.outputs.parameters.jobs}}`}}"

Bridge
one bridge per discovered job spec


Bridge
the bridge runs in the workflow namespace, the pod in the job spec's namespace


Lifecycle Container
files are mounted, env vars are injected
LIFECYCLE_TENANT_ENV_ID=799a8abb-653a-41c9-bec7-a47dba840eba
LIFECYCLE_BATCH_OPERATION_ID=01931b34-6589-7112-83ea-c6640f3f3ac1
LIFECYCLE_OPERATION_NAME=lifecycle.upsert
LIFECYCLE_TENANT_ENV_FILE=/mount/point/environment.yaml
LIFECYCLE_CHANGESET_FILE=/mount/point/changeset.yaml
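A lifecycle container receives its inputs through the LIFECYCLE_* variables listed on this slide. A minimal sketch of an entrypoint that reads them and dispatches on the operation name. The class, the function names, and the handler registry are hypothetical, and failing fast on a missing variable is an assumption. Parsing the mounted YAML files is left out.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LifecycleContext:
    tenant_env_id: str
    batch_operation_id: str
    operation_name: str
    tenant_env_file: str
    changeset_file: str


def load_context(env=os.environ) -> LifecycleContext:
    """Build the context from the LIFECYCLE_* variables; fail fast if one is missing."""
    def required(name: str) -> str:
        value = env.get(name)
        if not value:
            raise RuntimeError(f"missing environment variable {name}")
        return value

    return LifecycleContext(
        tenant_env_id=required("LIFECYCLE_TENANT_ENV_ID"),
        batch_operation_id=required("LIFECYCLE_BATCH_OPERATION_ID"),
        operation_name=required("LIFECYCLE_OPERATION_NAME"),
        tenant_env_file=required("LIFECYCLE_TENANT_ENV_FILE"),
        changeset_file=required("LIFECYCLE_CHANGESET_FILE"),
    )


def dispatch(ctx: LifecycleContext, handlers: dict):
    """Run the handler registered for the requested operation."""
    handler = handlers.get(ctx.operation_name)
    if handler is None:
        raise RuntimeError(f"unsupported operation {ctx.operation_name}")
    return handler(ctx)


# Example values copied from the slide; a real container would use os.environ.
sample_env = {
    "LIFECYCLE_TENANT_ENV_ID": "799a8abb-653a-41c9-bec7-a47dba840eba",
    "LIFECYCLE_BATCH_OPERATION_ID": "01931b34-6589-7112-83ea-c6640f3f3ac1",
    "LIFECYCLE_OPERATION_NAME": "lifecycle.upsert",
    "LIFECYCLE_TENANT_ENV_FILE": "/mount/point/environment.yaml",
    "LIFECYCLE_CHANGESET_FILE": "/mount/point/changeset.yaml",
}
ctx = load_context(sample_env)
print(dispatch(ctx, {"lifecycle.upsert": lambda c: f"upsert for {c.tenant_env_id}"}))
```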


Bridge <> Lifecycle Container
logs are streamed back to make them visible



Lifecycle Container
our Ansible/Terraform code, like any multi-tenant service


New - migrate out
migrating out of regions, inject object storage
annotations:
  collibra.com/tenant.lifecycle: "lifecycle.migrate-out"


New - migrate in
the other side, import from object storage
annotations:
  collibra.com/tenant.lifecycle: "lifecycle.migrate-in"


Admin Processes (XII)
reuse for enabling remote admin
[Diagram label: pub/sub]
annotations:
  collibra.com/tenant.lifecycle: "team.operation"


Future
full containerisation
[Diagram labels: namespace lifecycle; argo-cd server]
Discovery - Bridge
- Single small command-line program (Go)
- Born out of necessity: Argo Workflows doesn't execute jobs in other namespaces
- Interacts with the Kubernetes API to read the job spec, create pods, wait for the exit code, and read logs
- Discovery outputs JSON that the workflow picks up to create parallel branches
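The bridge's core step, turning a suspended Job's pod template into a Pod that runs in the Job's own namespace, can be sketched as a pure transformation. This is an illustration in Python rather than the Go tool itself. The function name, the pod naming scheme, and the way environment variables are injected are assumptions, and the Kubernetes API calls (create the pod, wait for the exit code, stream the logs) are omitted.

```python
import copy


def pod_from_job(job: dict, run_id: str, env: dict) -> dict:
    """Build a Pod manifest from a Job's pod template, injecting extra env vars."""
    # Deep-copy so the Job manifest read from the cluster is never mutated.
    template = copy.deepcopy(job["spec"]["template"])
    pod_spec = template.get("spec", {})
    # A bare Pod defaults to restartPolicy Always; a run-to-completion pod needs otherwise.
    pod_spec.setdefault("restartPolicy", "Never")
    extra = [{"name": key, "value": value} for key, value in sorted(env.items())]
    for container in pod_spec.get("containers", []):
        container["env"] = list(container.get("env", [])) + extra
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            # Hypothetical naming scheme: Job name plus a per-run suffix.
            "name": f"{job['metadata']['name']}-{run_id}",
            "namespace": job["metadata"]["namespace"],
            "labels": dict(template.get("metadata", {}).get("labels", {})),
        },
        "spec": pod_spec,
    }


job = {
    "metadata": {"name": "my-job", "namespace": "my-namespace"},
    "spec": {
        "suspend": True,
        "template": {"spec": {"containers": [{"name": "my-container"}]}},
    },
}
pod = pod_from_job(job, "run1", {"LIFECYCLE_OPERATION_NAME": "lifecycle.upsert"})
print(pod["metadata"]["name"], pod["metadata"]["namespace"])
```

Keeping this step free of API calls makes it easy to test without a cluster; the surrounding program would submit the returned manifest and then follow the pod.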
Result
- We have an API to manage our thousands of tenant environments, the same in production and development
- Under the hood it runs on Argo Events and Argo Workflows, yet only Job templates are exposed to teams for managing lifecycles, which gives them enough control
- Argo CD GitOps flows are exposed to teams developing multi-tenant services, including the means to distribute the Job templates

coordinates:
https://slides.com/alexvanboxel/tenant_management_argocon_na2024/
https://alexvanboxel.name/
Global Tenant Management using Argo
By Alex Van Boxel
For the Collibra data intelligence platform, a tenant is a combination of applications provisioned on a mixture of virtual machines and single- and multi-tenant Kubernetes services. This combination makes tenant lifecycle management non-trivial. In this session, we’ll explore using Argo Events, Workflows, and CD to build a global tenant management system. The system supports tenant lifecycle events like create, update, suspend, backup, and restore across various application types and infrastructure without exposing Argo-specific constructs to the application teams. This abstraction, called bridged workflows, allows teams to hook into the lifecycles in a simple Kubernetes native way, providing operations and developers the same simple participation in the global tenant management lifecycle. The case study will give you ideas for building a global tenant management system using the Argo suite.