Cloud ready services

Microservice Oriented Architecture in Docker checklist

Who is this guy?

Dmitry Paunin

Technical Group Manager at Lazada, OPS department

What is Docker?

Docker is an open-source project that automates the deployment of applications inside software containers.

How to build a good image?

What is Kubernetes?

Kubernetes (K8s) is an open-source container cluster manager that lets you:

Deploy your applications quickly and predictably
Scale your applications on the fly
Seamlessly roll out new features
Optimize use of your hardware

Why containers? Why Kubernetes?

Agile application creation and deployment
Continuous development, integration, and deployment
Dev and Ops separation of concerns
Environmental consistency across development, testing, and production
Cloud and OS distribution portability
Application-centric management
Loosely coupled, distributed, elastic, liberated micro-services
Resource isolation
Resource utilization

Answering the question about "good images"!

130 checkboxes in our list...

Checklist URL

https://github.com/paunin/soa-checklist

Administrative
(people, flows, responsibilities)

Blueprint/template for a new service
Documentation, standards, guides
- how-to, know-how documents
Team support
- Understanding of the whole process by each member
- Pro-active development and support
- Accepted responsibilities and duties for each stage of a service
Plan for the service life cycle
- Pre-production development
- Launching
- Rollout of a backward-compatible version of a service
- Hotfixing
- Rollout of a backward-incompatible version of a service
  - Data migration
  - Switchover
  - Service rollback

Automated processes

Continuous development
- Tests
  - Automated (Unit tests, Functional tests, Code style: lints and sniffers, Code quality monitoring (Sonar, Scrutinizer), Code coverage checks)
  - Manual (Feature acceptance/Business acceptance, A/B tests)
- Conditions of integration: Code style checks, Test results, Code coverage percentage
- Conditions for disintegrating (rolling back) a feature: Error rate after live deploy, Health checks
- Storing new tested snapshots/artefacts of a service
  - Artefact storage (Docker registry)
    - Cleanup policy (Delete old tags with timeout)
Continuous delivery of stable artefacts
- Images builder
- Services provisioning (Ansible)
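
The registry cleanup policy ("delete old tags with timeout") could be sketched roughly as below. This is only an illustration: the function name `tags_to_delete`, the protected `latest` tag and the 14-day timeout are assumptions, and the actual deletion call depends on the registry in use.

```python
from datetime import datetime, timedelta, timezone


def tags_to_delete(tags, max_age, now=None, keep=("latest",)):
    """Return the names of tags older than max_age, skipping protected names.

    `tags` maps a tag name to its creation time (timezone-aware datetime).
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name
        for name, created in tags.items()
        if name not in keep and now - created > max_age
    )


# Hypothetical tags of one image; only their age matters for the policy.
example_now = datetime(2017, 1, 31, tzinfo=timezone.utc)
example_tags = {
    "latest": datetime(2016, 1, 1, tzinfo=timezone.utc),
    "v0": datetime(2016, 12, 1, tzinfo=timezone.utc),
    "v1": datetime(2017, 1, 1, tzinfo=timezone.utc),
    "v2": datetime(2017, 1, 25, tzinfo=timezone.utc),
}
expired = tags_to_delete(example_tags, timedelta(days=14), now=example_now)
```

Keeping the selection logic separate from the deletion call makes the policy easy to test before it touches a real registry.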

Implementation

Architecture Layers

Implementation
vol.1

Hardware: Servers and networks

Scaling (adding new nodes) should not affect consistency of other layers
Degradation (removing nodes) should not affect consistency of other layers
Monitoring
- Hardware
- Network
- Resources and load
Alerting policy

Implementation
vol.2

Cluster: Services management system

Monitoring
- Availability of each node in the cluster
- All services up and running
- Connectivity between different pods and services
- Public endpoints accessibility
Alerting policy
Restart (full or partial) should bring cluster and systems up without destruction
Log aggregation system - collect all logs from all containers

Execution environment
- Meta-project with topology of the system
  - Showroom + Staging
    - Separate namespace for each showroom
    - Fixed showroom for the staging (last stable pre-release)
  - Production
    - Configuration
      - Secrets
      - Configs should be a part of the meta-project

Implementation
vol.3

Service: Application and any service

Service itself (Docker image)
- Backward compatibility for a few generations
  - Cleanup policy for deprecated/unused:
    - Logic branches
    - Data structures (RDBMS/NoSQL)
- One container - one process
  - Segregated commands even in one image (management layer can pick any to run)
  - Built-in commands
    - Test service/source code (docker compose to setup required test ENV)
  - DEV/DEBUG mode
- Logging
  - Writing to stdout (without using the container's file system) makes the cluster layer responsible for keeping all logs
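
A minimal sketch of "segregated commands in one image" combined with stdout logging, assuming a Python service. The command names (`serve`, `migrate`, `test`) and the `--debug` flag are hypothetical; the point is that the management layer picks one command per container and every log line goes to stdout.

```python
import argparse
import logging
import sys


def build_parser():
    """One entry point, several commands: the container runs exactly one."""
    parser = argparse.ArgumentParser(prog="service")
    parser.add_argument("--debug", action="store_true", help="DEV/DEBUG mode")
    commands = parser.add_subparsers(dest="command", required=True)
    commands.add_parser("serve", help="run the service process")
    commands.add_parser("migrate", help="run data migrations")
    commands.add_parser("test", help="run the built-in tests")
    return parser


def configure_logging(debug):
    """Send all log records to stdout so the cluster layer can collect them."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s"))
    root = logging.getLogger()
    root.handlers[:] = [handler]
    root.setLevel(logging.DEBUG if debug else logging.INFO)


def main(argv=None):
    args = build_parser().parse_args(argv)
    configure_logging(args.debug)
    logging.getLogger("service").info("running command: %s", args.command)
    return args.command
```

With this layout the same image can be started as `service serve` for the main process or `service migrate` for a one-off job, without writing anything to the container's file system.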

...Service itself (Docker image)
- Monitoring
  - Application and business checks (New Relic: throughput, metrics)
  - Self health checks (metrics + Prometheus + Grafana)
    - Queues content (amount of messages)
    - Db content (custom checks)
    - Cache utilization check
- Alerting policies (Prometheus, New Relic)
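
Self health checks can be thought of as a set of named probes whose results are aggregated into one status. The sketch below assumes each probe is a plain callable returning a truthy value when healthy; the probe names and the queue threshold are made up, and exporting the result to a monitoring system is left out.

```python
def run_checks(checks):
    """Run every named check; a check that raises counts as failed."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return {"healthy": all(results.values()), "checks": results}


# Hypothetical probes: queue depth under a limit, and a DB probe that fails.
queue_depth = 5
status = run_checks({
    "queue": lambda: queue_depth < 100,
    "db": lambda: 1 / 0,  # simulates a broken dependency
})
```

Reporting each probe separately, not only the overall flag, lets alerting policies target the failing dependency.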

...Service itself (Docker image)
- Self-sufficiency
  - Interfaces documentation
    - Restful API
      - Swagger
    - Port and service description (README.md files)
  - Service should be able to set itself up
    - Wait for required related services and ports (dockerize)
    - Configuring from environment variables (confd)
    - Warming up
      - Run data migration (requires a maintenance service)
      - Cache filling
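
What tools like dockerize and confd do for a service at startup can be approximated in a few lines. The sketch below is not how those tools are implemented; it only illustrates the two ideas, waiting for a required port and reading configuration from environment variables. The variable names `DB_HOST` and `DB_PORT` and their defaults are assumptions.

```python
import os
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def load_config(env=os.environ):
    """Build the service configuration from environment variables."""
    return {
        "db_host": env.get("DB_HOST", "localhost"),
        "db_port": int(env.get("DB_PORT", "5432")),
    }
```

A service entry point would typically call `load_config()` first, then `wait_for_port(config["db_host"], config["db_port"])`, and only then start migrations and cache warm-up.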

Replication, balancing and scaling on service level
Failover and self-reorganisation in case of:
- Service crashed
- Physical node out of cluster
- Resources problems on specific node
Logs system
- Service to collect and access logs grabbed from Cluster layer
  - ELK stack/Graylog/etc.
Persistent volumes to keep data
- AWS EBS
- Ceph
- NFS

Implementation
vol.4

Common services

Single sign-on service
- Authentication service (JWT)
- Authorization requests from all services
Detached processing (CQRS)
- Request-Queue-Processor schema
- Stream data addressing and processing (Reactor)
Real Time data requests processing
- Reliable data provider/API gateway (sync data retrieving)
  - Request-Manager-Service solution
Reliable data-bus for events
- Event-Broker-Subscriber solution (Apache Camel)
  - Http/TCP API endpoint to accept events
  - Event enrichment (gather the information subscribers require)
  - Event delivery
  - Event delivery policies: Retry, Requeue, Give up
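
One possible reading of the three delivery policies, sketched in Python. The slides do not define their exact semantics, so the interpretation here is an assumption: Retry re-sends up to `max_attempts` times, Requeue hands the event back to the queue after one failed attempt, and Give up drops it. The names `Policy` and `deliver` are illustrative.

```python
from enum import Enum


class Policy(Enum):
    RETRY = "retry"
    REQUEUE = "requeue"
    GIVE_UP = "give_up"


def deliver(event, send, policy, max_attempts=3, requeue=None):
    """Try to deliver an event; return 'delivered', 'requeued' or 'dropped'."""
    attempts = max_attempts if policy is Policy.RETRY else 1
    for _ in range(attempts):
        try:
            send(event)
            return "delivered"
        except Exception:
            continue  # subscriber unavailable, try again if attempts remain
    if policy is Policy.REQUEUE and requeue is not None:
        requeue(event)
        return "requeued"
    return "dropped"
```

In a real broker the retry loop would usually include a delay or backoff between attempts, which is omitted here to keep the sketch short.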

RDBMS: Postgres cluster
DB backups: PG backupper
Key-value + Queue: Redis cluster
Messages system: RabbitMQ cluster
Healthcheck system
Alerting system

Thanks. Questions?

Slides URL

https://slides.com/dmitriypaunin/cloud-ready-services

By Dmitriy Paunin
