Next Generation Data Center

Agenda

About Me
Research Notes
Overview
Virtualization
Cloud Computing
Buzzword Primer
Solution Proposal
OpenStack/Ceph

About Me

10+ years of experience in systems engineering automation
5+ years of experience in full stack web development
5 years on the systems engineering team at Puzzle ITC, maintaining 200+ bare metal and virtual servers across three datacenters
2 years on the development team at Atizo, a community-driven crowdsourcing platform with 15k users running on multiple cloud providers
1 year on the private cloud engineering team at SWISS TXT, building a platform for SRF/RTS/RSI across two datacenters

Research Notes

40h Research
12h Presentation preparation
12h Presentation design
Informal exchange with lead architect of new Swisscom Application Cloud

Overview

WORKLOAD

highly available
highly automated
stateless
stateful
immutable
ephemeral
shift & load

COMPUTE

virtualization
containers
cloud computing
self-service
multi-tenancy
hyperconvergence

STORAGE

SAN/NAS
virtualization
shared storage
local storage
replication
scale-out storage

NETWORK

network function virtualization
virtual network function
software defined networking
converged network
leaf-spine architecture
QoS
global load balancing
anycast/DNS

RESILIENCY

scalability
availability
redundancy
disaster recovery
RTO/RPO
distributed systems
quorum
split-brain
CAP
partition tolerance
active/active
active/passive
metro/stretch-cluster

AUTOMATION

configuration management
software defined
IaaS
PaaS

Virtualization

COMPUTE

NETWORK

VIRTUALIZATION

STORAGE

Compute Virtualization

Compute Virtualization Hypervisors

Compute Virtualization Container Engines

Storage Virtualization

Storage Virtualization Solutions

Network Virtualization

Network Virtualization Solutions

OpenStack Neutron

Cloud Computing

Cloud Computing Service Models

IaaS

COMPUTE

NETWORK

VIRTUALIZATION

IAAS

STORAGE

AUTOMATION

IaaS

unified management of virtual resources
- compute, storage and network
acquire resources through a single API or UI
highly automated resource acquisition
highly abstracted
very short provisioning times
multi-tenancy
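
The idea of acquiring compute, storage and network resources through one API, separated per tenant, can be sketched with a small in-memory model. This is only an illustration: `ToyCloud`, `create` and `list_for` are hypothetical names invented here and do not correspond to any real IaaS client or to the OpenStack API.

```python
from itertools import count


class ToyCloud:
    """Hypothetical in-memory stand-in for an IaaS API (not a real client)."""

    KINDS = {"compute", "storage", "network"}

    def __init__(self):
        self._ids = count(1)   # simple resource id generator
        self._resources = {}   # resource id -> resource description

    def create(self, tenant, kind, **spec):
        """Provision any kind of virtual resource through the same call."""
        if kind not in self.KINDS:
            raise ValueError(f"unknown resource kind: {kind}")
        resource_id = next(self._ids)
        self._resources[resource_id] = {"tenant": tenant, "kind": kind, **spec}
        return resource_id

    def list_for(self, tenant):
        """Multi-tenancy: a tenant only sees its own resources."""
        return {
            rid: res
            for rid, res in self._resources.items()
            if res["tenant"] == tenant
        }


cloud = ToyCloud()
vm = cloud.create("team-a", "compute", vcpus=2, ram_gb=4)
net = cloud.create("team-a", "network", cidr="10.0.0.0/24")
vol = cloud.create("team-b", "storage", size_gb=100)
print(sorted(cloud.list_for("team-a")))
```

A real platform adds authentication, quotas and asynchronous provisioning on top; the sketch only shows the single entry point and the tenant separation.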

IaaS Solutions

for private clouds

PaaS

COMPUTE

NETWORK

VIRTUALIZATION

IAAS

PAAS

STORAGE

AUTOMATION

PaaS

Platform Services as Resources
Application Servers
- php, java, python, ...
Databases
- mysql, postgresql, ...
Queues and Indexes
- RabbitMQ, Elasticsearch, ...

PaaS Solutions

for private clouds

Buzzword Primer

a.k.a. Buzzword-Bingo a.k.a. Bullshit-Bingo

Workload

highly available
highly automated
immutable
ephemeral
persistent
stateful
stateless
shift & load

Compute

virtualization
cloud computing
multi-tenancy
self-service
hyperconvergence

Storage

SAN/NAS
virtualization
shared storage
local storage
replication
scale-out storage

Scale-out Storage

Leaf-Spine Architecture

Global Load Balancing

DNS
BGP anycast

Automation

configuration management
software defined
IaaS
PaaS

Resiliency

scalability
availability
redundancy
disaster recovery
RTO/RPO
distributed systems
quorum
split-brain
CAP
partition tolerance
active/active
active/passive
metro/stretch-cluster

distributed systems are hard!

Resiliency

where is the state?

Resiliency

only distributed state is hard!

CAP

RPO/RTO

Disaster Recovery Metrics
Recovery Point Objective
- How much data is lost?
Recovery Time Objective
- How long does it take to recover?
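
As a worked example, both metrics can be measured for a single incident and compared against the objectives. The timestamps and the objective values below are made up for illustration, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_backup, failure, service_restored):
    """Return (data loss window, downtime) for one incident."""
    if not last_backup <= failure <= service_restored:
        raise ValueError("expected last_backup <= failure <= service_restored")
    data_loss = failure - last_backup        # compared against the RPO
    downtime = service_restored - failure    # compared against the RTO
    return data_loss, downtime


def meets_objectives(data_loss, downtime, rpo, rto):
    """True if the incident stayed within both objectives."""
    return data_loss <= rpo and downtime <= rto


# Hypothetical incident: nightly backup at 02:00, failure at 09:30,
# service back at 11:00.
data_loss, downtime = recovery_metrics(
    last_backup=datetime(2015, 1, 1, 2, 0),
    failure=datetime(2015, 1, 1, 9, 30),
    service_restored=datetime(2015, 1, 1, 11, 0),
)
print(data_loss, downtime)  # 7:30:00 1:30:00

# Assumed objectives: lose at most 24h of data, recover within 4h.
print(meets_objectives(data_loss, downtime,
                       rpo=timedelta(hours=24), rto=timedelta(hours=4)))
```

With a nightly backup the worst-case data loss approaches 24 hours, so a tighter RPO needs more frequent backups or replication.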

Quorum/Split-Brain/Partition

Metrics to determine if a distributed system is healthy
What happens if a distributed system falls apart?
- How does operation continue?
- What strategies exist to rebuild the system?
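
A minimal sketch of the majority rule that many clustered systems use to avoid split-brain: only the side of a partition that still sees more than half of all members keeps operating. The function names are hypothetical, and real systems add fencing, tie-breakers or witness nodes on top of this.

```python
def quorum_size(cluster_size):
    """Smallest number of members that forms a strict majority."""
    return cluster_size // 2 + 1


def has_quorum(reachable, cluster_size):
    """True if this side of a partition may keep operating."""
    return reachable >= quorum_size(cluster_size)


def partitions_with_quorum(partition_sizes):
    """For each side of a network partition, report whether it has quorum."""
    total = sum(partition_sizes)
    return [has_quorum(size, total) for size in partition_sizes]


# Five nodes split 3/2: exactly one side continues.
print(partitions_with_quorum([3, 2]))  # [True, False]

# Four nodes split 2/2: neither side has a majority, so the whole
# system stops rather than risking split-brain.
print(partitions_with_quorum([2, 2]))  # [False, False]
```

The 2/2 case shows why an even split across two datacenters is problematic: without a third location to break the tie, no side can safely continue.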

metro/stretch-cluster

distributed systems that span datacenters
distributed systems are already hard
even harder across datacenters

Solution Proposal

Technologies

OpenStack Cloud Computing Platform
Ceph Storage Platform

Architecture

One independent OpenStack platform per datacenter
3 types of nodes: mgmt/compute/storage
Platform is built with the same automation principles as the new MSP environment

Hardware Options

Three types of nodes: mgmt/compute/storage
- mgmt: openstack admin nodes
- mgmt: openstack network nodes
- mgmt: ceph admin nodes
- compute: openstack compute nodes
- storage: shared storage cluster nodes

Storage Strategy

Options for vm storage:
- shared storage & attached ephemeral local storage
  - automated provisioning & workload resiliency
- shared storage
  - automated provisioning; no workload resiliency
  - manual provisioning

Resiliency Strategies

Resiliency handled in workload
- phase one: enable fast disaster recoveries
- phase two: enable active/passive modes
- phase three: enable live traffic handling across datacenters
implement as few distributed systems as possible
- especially across datacenters
strategies on how to handle distributed system partitions must be defined and tested during implementation

OpenStack

OpenStack Statistics

OpenStack Architecture

OpenStack Multi-Tenancy

OpenStack Networking

OpenStack and the Transformation of the Data Center

http://www.slideshare.net/lewtucker/open-stack-atlanta-2014tucker

Ceph

completely distributed storage cluster
interfaces for object, block and file-level storage
clients talk directly to storage nodes
no single point of failure
fault-tolerant
self-healing
self-managing
runs on commodity hardware
single cluster: just add disks and nodes to scale out
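
Clients can talk directly to storage nodes because the location of an object is computed rather than looked up in a central table. Ceph uses its own CRUSH algorithm for this; the sketch below deliberately swaps in plain rendezvous hashing, which is not what Ceph implements, only to show the principle of computed placement. The node names, object name and the `place` function are hypothetical.

```python
import hashlib


def _score(object_name, node):
    """Deterministic pseudo-random score for an (object, node) pair."""
    digest = hashlib.sha256(f"{object_name}:{node}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name, nodes, replicas=3):
    """Pick the nodes that store an object, without a central lookup.

    Every client that knows the node list computes the same answer.
    """
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")
    ranked = sorted(nodes, key=lambda n: _score(object_name, n), reverse=True)
    return ranked[:replicas]


nodes = [f"storage-{i}" for i in range(6)]
print(place("vm-disk-42", nodes))

# Scaling out: after adding a node, an object either keeps its placement
# or moves one replica onto the new node; nothing else is reshuffled.
print(place("vm-disk-42", nodes + ["storage-6"]))
```

CRUSH additionally takes the physical topology (hosts, racks, datacenters) and device weights into account, which this sketch ignores.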
