Infrastructure Vision 2023
Where should we start?
With a little story
Let's plan a festival

What do we need?
A checklist?








Requirements for a little festival
- The fun part
  - Space
  - Acts
  - Enough food/drinks
- The not so fun part
  - Bathrooms/toilets
  - Some trash cans
Requirements for a small festival
- The fun part
  - More space
  - More acts
  - More, and more diverse, food/drinks
- The not so fun part
  - Backstage area
  - More bathrooms/toilets
  - Water and wastewater
  - Disposal system
  - Permissions
Requirements for a bigger festival
- The fun part
  - Space
  - More acts
  - More, and more diverse, food/drinks
- The not so fun part
  - Multiple backstages with restrooms and special requirements
  - Multiple restroom areas
  - Water and wastewater
  - Disposal system
  - Permissions
  - Runways
  - Security
  - Paramedics
  - Escape routes
  - Regulatory frameworks

Requirements for a large festival
- The fun part
  - Space
  - More acts
  - More, and more diverse, food/drinks
- The not so fun part
  - Multiple backstages
  - Multiple restroom areas
  - Water and wastewater
  - Disposal system
  - Permissions
  - Runways
  - Security
  - Paramedics
  - Escape routes
  - Regulatory frameworks
  - Traffic management
  - Backups
  - Monitoring
  - ...

How do we deal with the not so fun parts?
With rulesets and logic
- Running orders
- "Do and don't" lists
  - Team
  - Guests
- Diagrams and plans
- Enforce scalable units (tent size)
- React based on monitoring
Infrastructure Vision FOREVER

Solve infrastructure problems with code
Infrastructure as Code (IaC)
SSH is dead

Immutability

It works on my computer
It actually works on your computer too


Git is our TRUTH

A look at what we have

Declarative IaC
# Terraform: declare the bucket, Terraform works out how to create it
terraform {
  required_version = "0.11.13"
}

provider "aws" {
  region = "eu-central-1"
}

resource "aws_s3_bucket" "your_new_bucket" {
  bucket = "my-first-website-cloud-native-website"
  acl    = "public-read"

  website {
    index_document = "index.html"
  }
}

# Kubernetes: declare the Pod, the cluster works out how to run it
apiVersion: v1
kind: Pod
metadata:
  name: nicepod
  labels:
    App: dev
spec:
  containers:
    - name: web
      image: nginx
      ports:
        - name: web
          containerPort: 80
          protocol: TCP

* Imperative is a possibility too
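What "declarative" means in practice can be sketched in a few lines of Python. This is only an illustration of the idea, not how Terraform or Kubernetes are implemented: the `plan` and `apply` functions and the resource dictionaries are hypothetical names made up for this example. We describe the state we want, a tool compares it with what exists, and it derives the steps by itself.

```python
def plan(desired, actual):
    """List the actions needed to turn `actual` into `desired`."""
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(desired, actual):
    """Carry out the plan and return the new actual state."""
    state = dict(actual)
    for action, name in plan(desired, actual):
        if action == "delete":
            del state[name]
        else:  # "create" and "update" both end in the declared spec
            state[name] = dict(desired[name])
    return state


# Hypothetical desired state (what is in Git) and actual state (what is running).
desired = {
    "your_new_bucket": {"acl": "public-read", "index_document": "index.html"},
    "nicepod": {"image": "nginx", "port": 80},
}
actual = {
    "nicepod": {"image": "nginx", "port": 8080},
    "old_bucket": {"acl": "private"},
}

for action, name in plan(desired, actual):
    print(action, name)
```

Applying the same desired state twice changes nothing the second time, which is the property that lets Git act as the single source of truth.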



GitOps with Terraform Cloud

GitOps with ArgoCD


Special thanks to the ugly avatar bot for deploying all the stuff for us.
A hero without a cape
parcelLab/deployment.git
Collaboration with Infrastructure as Code is efficient

parcelLab/infrastructure.git
EC2
Our AWS datacenter running parcelLab's EC2 instances - true story

EKS
aka "Kubernetes where AWS does all the nasty stuff"

Bottlerocket
- Lightweight
- Secure
- Open source
- Optimized for EC2
- Supported and managed by AWS
Karpenter

Just-in-time Nodes

- Open source
- Improve app availability
- Lower compute costs
- Minimum operational overhead

Node 1
Node 2
Node 3
Node 4
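The just-in-time idea can be sketched roughly as follows. This is a simplified illustration and not Karpenter's actual algorithm: the `InstanceType` catalogue, its names and prices, and the `provision` function are all assumptions made up for the example. Pending pods state how much CPU they need (in millicores), and nodes are launched only when needed, preferring the cheapest type that fits everything still waiting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    cpu_millicores: int   # CPU available to pods on one node
    hourly_price: float   # made-up price, only used for comparison


def provision(pending, catalogue):
    """Return the instance type names to launch for the pending pods."""
    largest = max(catalogue, key=lambda t: t.cpu_millicores)
    remaining = sorted(pending, reverse=True)
    if remaining and remaining[0] > largest.cpu_millicores:
        raise ValueError("a pod does not fit on any instance type")

    launched = []
    while remaining:
        total = sum(remaining)
        # Cheapest type that holds everything left, otherwise the largest one.
        fitting = [t for t in catalogue if t.cpu_millicores >= total]
        chosen = min(fitting, key=lambda t: t.hourly_price) if fitting else largest

        free = chosen.cpu_millicores
        unplaced = []
        for cpu in remaining:  # first-fit, biggest pods first
            if cpu <= free:
                free -= cpu
            else:
                unplaced.append(cpu)
        launched.append(chosen.name)
        remaining = unplaced
    return launched


# Hypothetical catalogue; real instance types and prices will differ.
catalogue = [
    InstanceType("small", 2000, 0.10),
    InstanceType("medium", 4000, 0.19),
    InstanceType("large", 8000, 0.40),
]

print(provision([500, 500, 1000], catalogue))
print(provision([3000, 3000, 3000, 1000], catalogue))
```

With no pending pods nothing is launched, which is where the lower compute cost comes from.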

EC2 + Karpenter + Bottlerocket + EKS
Our AWS datacenter now - not exaggerated
parcelBazaar

Deployment as a Service
- plconfig v1 (easy deployments)
- Scaling capabilities (manual or automatic)
- Monitoring in Datadog
- Fine-grained permissions per team (jail)
- Secrets management via environment variables
- HTTPS ingress with automatically created certificates
Teams only take care of the app configuration and its container(s)
Monitoring as a Service
- Datadog enablement
- Dashboards for customers, etc.
- Getting a 360° view of our system
Authentication as a Service
- Keycloak and Auth0 under evaluation
- Securing applications by configuration
- Externalising the administrative overhead of authorisation
The actual 2023 vision

Containerized services in Kubernetes
- Move all legacy workloads (EC2, ECS, Lambdas...)
- Reduce infrastructure costs
  - Spot instances
  - Karpenter auto scaling
  - Bottlerocket
  - Monitoring to tweak resource requests and limits
- Give more options for automatic and manual scaling of workloads
- Faster scaling and adaptation to unexpected loads
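As a rough sketch of what "monitoring to tweak resource requests and limits" could look like: take the CPU usage samples a workload reported, and derive a request from a high percentile and a limit from the peak, each with some headroom. The `recommend` function, the 90th percentile and the 20% headroom are assumptions chosen for illustration, not an agreed policy, and the samples are invented.

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of integers."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def with_headroom(value, headroom_pct):
    """Add headroom_pct percent on top of value, rounded up."""
    return (value * (100 + headroom_pct) + 99) // 100


def recommend(samples_millicores, pct=90, headroom_pct=20):
    """Suggest a CPU request and limit (millicores) from observed usage."""
    return {
        "request": with_headroom(percentile(samples_millicores, pct), headroom_pct),
        "limit": with_headroom(max(samples_millicores), headroom_pct),
    }


# Invented usage samples in millicores, with one spike.
usage = [100, 120, 110, 130, 500, 115, 125, 105, 118, 122]
print(recommend(usage))
```

The request follows typical usage, so nodes can be packed tightly, while the limit still leaves room for the occasional spike.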
Monitoring enablement
- Teams with monitoring ownership
- Get a lot of stuff built in
Single Sign On everywhere
- Use the internal Microsoft account to sign into any internal tool we have
- Put Azure into Terraform so onboarding/offboarding in Microsoft is also automated
- Define the group scheme and permissions the same way we do with GitHub/AWS
Ownership of common infrastructure
- Only the "base" part (common to multiple teams)
  - MongoDB
  - SES
  - SQS...
- The goal is to enable other teams to create their own resources there (and monitor them)
Vanity domain white-labeling
- Leverage Let's Encrypt (already in place) to save money
- Cloudflare has some functions that need to be migrated with care
- Customer migration plan
- Self-service enablement for our colleagues


Stay tuned: January 12th 2023
Infrastructure vision v2
By andibeuge