Lee Calcote
Clouds, containers, functions, applications and their management.
May 2019
[Diagram: data plane with ingress gateway and egress gateway]
No control plane? Not a service mesh.
[Diagram: control plane and data plane with ingress gateway and egress gateway]
You need a management plane.
[Diagram: management plane, control plane, and data plane with ingress gateway and egress gateway]
[Diagram: Istio architecture. Control plane (istio-system namespace): Pilot, Citadel, Mixer, Galley. Data plane (application namespace): Service Foo with Foo Pod (Foo Container plus Proxy Sidecar), Service Bar with Bar Pod (Bar Container plus Proxy Sidecar), Ingress Gateway, Egress Gateway. Flows shown: discovery & config, TLS certs, policy check, telemetry reports, application traffic. Legend: out-of-band telemetry propagation; control flow during request processing; application traffic.]
[Diagram: Linkerd architecture. Control plane (linkerd-system namespace) components shown: destination, proxy-api, public-api, tap, web, proxy-injector, Prometheus, Grafana; also the CLI. Data plane (application namespace): Service Foo with Foo Pod (Foo Container plus Proxy Sidecar), Service Bar with Bar Pod (Bar Container plus Proxy Sidecar). Flows shown: telemetry scraping, application traffic. Legend: out-of-band telemetry propagation; control flow during request processing; application traffic.]
[Diagram labels: Client; Edge Cache; Istio Gateway (envoy); Cache Generator; Collection of VMs running APIs; service mesh; Istio VirtualService (twice); Istio ServiceEntry]
Situation:
Benefits:
[Diagram: three nodes (servers), each running one linkerd proxy. Node 1 hosts three instances of Service A; node 2 hosts two instances of Service A and Service B; node 3 hosts two instances of Service A and Service C. Legend: out-of-band telemetry propagation; control flow during request processing; application traffic.]
Advantages:
Less (memory) overhead.
Simpler distribution of configuration information.
Primarily physical or virtual server based; good for large monolithic applications.
Disadvantages:
Coarse support for encrypting service-to-service communication; encryption and authentication policies apply host-to-host instead.
Blast radius of a proxy failure includes all applications on the node, which is essentially equivalent to losing the node itself.
Not a transparent entity; services must be aware of its existence.
Advantages:
Good starting point for building a brand-new microservices architecture or for migrating from a monolith.
Disadvantages:
As the number of services increases, it becomes difficult to manage.
Mixer: an attribute processing engine
[Diagram: Mixer in the control plane (istio-system namespace); Service Foo with Foo Pod (Foo Container plus Proxy sidecar) in the data plane (application namespace). Flows shown: telemetry reports, application traffic. Legend: out-of-band telemetry propagation; control flow during request processing; application traffic.]
types: logs, metrics, access control, quota
AppOptics™
Papertrail™
Prometheus™
Stackdriver™
Open Policy Agent
Grafana™
Fluentd
Statsd
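As a rough illustration of what "an attribute processing engine" can mean, the sketch below models a toy dispatcher: a proxy reports a bag of attributes, and the engine hands it to whichever adapters are registered for that type (logs, metrics, and so on). The class `AttributeEngine`, the adapter functions, and the attribute names are all hypothetical; this is not Mixer's actual API.

```python
from collections import Counter, defaultdict


class AttributeEngine:
    """Toy dispatcher: routes attribute bags to adapters registered per type."""

    def __init__(self):
        self._adapters = defaultdict(list)

    def register(self, kind, adapter):
        self._adapters[kind].append(adapter)

    def report(self, kind, attributes):
        # Every adapter registered for this type sees the same attributes.
        return [adapter(attributes) for adapter in self._adapters[kind]]


request_counts = Counter()


def count_requests(attributes):
    # Hypothetical metrics adapter: count requests per destination service.
    request_counts[attributes["destination"]] += 1
    return request_counts[attributes["destination"]]


def format_log(attributes):
    # Hypothetical logging adapter: render one access-log line.
    return f'{attributes["source"]} -> {attributes["destination"]} {attributes["status"]}'


engine = AttributeEngine()
engine.register("metrics", count_requests)
engine.register("logs", format_log)

# Attributes a sidecar proxy might report for one request (made-up values).
attrs = {"source": "foo", "destination": "bar", "status": 200}
engine.report("metrics", attrs)
log_lines = engine.report("logs", attrs)
print(log_lines[0])
```

The point of the design is that services never call the back ends directly; swapping a metrics back end means registering a different adapter, not changing service code.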
®
[Diagram: Istio control plane (istio-system namespace): Pilot, Citadel, Mixer, Galley]
a multi-service mesh management plane
https://layer5.io/meshery
Service Mesh Interface (SMI)
@lcalcote
layer5.io/books
layer5.io/landscape
Playground
WHICH SERVICE MESH SHOULD I USE AND HOW DO I GET STARTED?
Learn about the functionality of different service meshes and visually manipulate mesh configuration.
Performance Benchmark
WHAT OVERHEAD DOES BEING ON THE SERVICE MESH INCUR?
Benchmark the performance of your application across different service meshes and compare their overhead.
layer5.io/meshery
Istio
Linkerd
Octarine
NSM
App Mesh
results forthcoming at KubeCon EU...
Consul
Up next...
(service meshes contributing adapters)
Kubernetes
(no mesh)
Demo
Cores | Threads | Istio (2) | Linkerd |
---|---|---|---|
8 | 8 | 1 | 1 |
8 | 16 | 1.7 | 1.8 |
8 | 32 | 3.2 | 3.4 |
8 | 100 | 9.3 | 9.6 |
(2) mTLS on, tracing off
Cores | Threads | Istio (1) | Istio (2) | Linkerd |
---|---|---|---|---|
8 | 8 | 1 | 1 | 1 |
8 | 16 | 1.4 | 1.7 | 1.8 |
8 | 32 | 18.4 | 3.2 | 3.4 |
8 | 100 | 52.2 | 9.3 | 9.6 |
(1) mTLS on, tracing on
(2) mTLS on, tracing off
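To make the effect of tracing in the table easier to see, the short sketch below divides the Istio (1) column (tracing on) by the Istio (2) column (tracing off) for each thread count. The numbers are copied from the table; the slide does not state their unit, so the ratio is only a relative comparison, and the dictionary keys are names chosen here for illustration.

```python
# Values copied from the table above, keyed by thread count (8 cores in every row).
rows = {
    8: {"istio_tracing_on": 1.0, "istio_tracing_off": 1.0, "linkerd": 1.0},
    16: {"istio_tracing_on": 1.4, "istio_tracing_off": 1.7, "linkerd": 1.8},
    32: {"istio_tracing_on": 18.4, "istio_tracing_off": 3.2, "linkerd": 3.4},
    100: {"istio_tracing_on": 52.2, "istio_tracing_off": 9.3, "linkerd": 9.6},
}


def tracing_ratio(row):
    """Ratio of the tracing-on figure to the tracing-off figure for one row."""
    return row["istio_tracing_on"] / row["istio_tracing_off"]


for threads, row in rows.items():
    print(f"{threads:>3} threads: tracing on / tracing off = {tracing_ratio(row):.2f}")
```

At 32 and 100 threads the tracing-on figure is more than five times the tracing-off figure, while at 8 and 16 threads the two are close.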
A project and vendor-neutral specification for capturing details of:
Environment / Infrastructure
Number and size of nodes, orchestrator
Service mesh and its configuration
Service / application details
Bundled with test results.
github.com/layer5io/service-mesh-benchmark-spec
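One way to picture the kind of record such a specification could capture is the sketch below. It is not the actual schema from the repository: the class names, field names, and the node count are hypothetical placeholders, and only the categories (environment, mesh and its configuration, workload, bundled results) come from the list above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Environment:
    orchestrator: str
    node_count: int
    node_size: str


@dataclass
class BenchmarkRecord:
    environment: Environment
    service_mesh: str
    mesh_config: dict
    workload: str
    # Test results travel with the description of how they were produced.
    results: dict = field(default_factory=dict)


record = BenchmarkRecord(
    # node_count is a made-up placeholder, not a figure from the talk.
    environment=Environment(orchestrator="Kubernetes", node_count=3, node_size="8 cores"),
    service_mesh="Istio",
    mesh_config={"mtls": True, "tracing": False},
    workload="example-app",  # hypothetical workload name
    results={"threads": 16, "relative_value": 1.7},
)

serialized = json.dumps(asdict(record), indent=2)
print(serialized)
```

Keeping the environment and mesh configuration in the same document as the results is what lets two runs be compared fairly.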
Service Mesh Community
Layer5.io
a dedicated layer for managing service-to-service communication
So, a microservices platform?
obviously.
Orchestrators don't bring all that you need
and neither do service meshes,
but they do get you closer.
Missing: application lifecycle management, but not by much
partially.
Missing: distributed debugging; provide nascent visibility (topology)
First few services are relatively easy
Democratization of language and technology choice
Faster delivery, service teams running independently, rolling updates
Next 10 or so may introduce pain
Language and framework-specific libraries
Distributed environments, ephemeral infrastructure, outmoded tooling
to avoid...
Bloated service code
Duplicating work to make services production-ready
Load balancing, auto scaling, rate limiting, traffic routing...
Inconsistency across services
Retry, TLS, failover, deadlines, cancellation, etc., for each language and framework
Siloed implementations lead to fragmented, non-uniform policy application and difficult debugging
Diffusing responsibility of service management
• Observability
• Logging
• Metrics
• Tracing
• Traffic Control
• Resiliency
• Efficiency
• Security
• Policy
what gets people hooked on service metrics
Metrics without instrumenting apps
Consistent metrics across fleet
Trace flow of requests across services
Portable across metric back-end providers
You get a metric! You get a metric! Everyone gets a metric!
© 2018 SolarWinds Worldwide, LLC. All rights reserved.
control over chaos
Timeouts and Retries with timeout budget
Control connection pool size and request load
Circuit breakers and Health checks
content-based traffic steering
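The "retries with timeout budget" idea listed above can be sketched as a retry loop that stops either when the retry count is used up or when the overall time budget is spent, whichever comes first. This is a minimal sketch and not any mesh's implementation; the function name `call_with_retry_budget` and its parameters are hypothetical.

```python
import time


def call_with_retry_budget(operation, budget_s, max_retries=3, clock=time.monotonic):
    """Call operation(timeout_s) until it succeeds, retries run out, or the budget is spent."""
    deadline = clock() + budget_s
    last_error = None
    for _ in range(1 + max_retries):
        remaining = deadline - clock()
        if remaining <= 0:
            break  # budget spent: stop retrying even if attempts are left
        try:
            # Each attempt is told how much of the budget is still available.
            return operation(remaining)
        except Exception as exc:  # a real proxy would retry only retriable errors
            last_error = exc
    raise TimeoutError("retry budget exhausted") from last_error


attempts = []


def flaky(timeout_s):
    # Hypothetical upstream call: fails twice, then succeeds.
    attempts.append(timeout_s)
    if len(attempts) < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = call_with_retry_budget(flaky, budget_s=0.6)
print(result, len(attempts))
```

Bounding retries by a shared budget keeps a slow dependency from multiplying latency: three retries at a fixed per-attempt timeout can take four times that timeout, while a budget caps the total.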
[Diagram: per-hop timeouts and retries across Web, Service Foo, Service Bar, and Database. First group of settings: Timeout = 600ms, 300ms, and 900ms, each with Retries = 3. Second group of settings: Timeout = 500ms, 300ms, and 900ms, each with Retries = 3.]
[Diagram: deadline propagation across Web, Service Foo, Service Bar, and Database. Deadline = 600ms, then 496ms, 428ms, and 180ms; the elapsed times between them are 104ms, 68ms, and 248ms.]
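The deadline figures above follow from simple subtraction: each hop subtracts the time it has used and passes on what is left (600 - 104 = 496, 496 - 68 = 428, 428 - 248 = 180). The sketch below reproduces that arithmetic. Which hop each elapsed time belongs to is an assumption read off the order of the labels, and the `Hop` class and `propagate_deadline` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    name: str
    elapsed_ms: int  # time this hop uses before the deadline is passed on


def propagate_deadline(initial_ms, hops):
    """Return (hop name, deadline remaining after that hop) for each hop in order."""
    remaining = initial_ms
    budget = []
    for hop in hops:
        remaining -= hop.elapsed_ms
        if remaining <= 0:
            # No point forwarding work that can no longer finish in time.
            raise TimeoutError(f"deadline exceeded at {hop.name}")
        budget.append((hop.name, remaining))
    return budget


# Assumed assignment of the elapsed times from the slide to the three hops.
hops = [Hop("Service Foo", 104), Hop("Service Bar", 68), Hop("Database", 248)]
for name, remaining in propagate_deadline(600, hops):
    print(f"after {name}: {remaining}ms of the deadline remains")
```

Unlike fixed per-hop timeouts, a propagated deadline lets a downstream service give up early when the caller's budget is already gone.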
where Dev and Ops meet
Problem: too much infrastructure code in services
By Lee Calcote
Presented at KubeCon EU 2019.