Introduction to Cloud Composer

A managed service for running Apache Airflow on Google Cloud

Contents

What is Cloud Composer?

Airflow and DAGs

Cloud Composer

CPU

RAM

Cluster Management

Security
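To make the term concrete: a DAG (directed acyclic graph) is a set of tasks plus the dependencies between them. The sketch below deliberately uses only Python's standard-library graphlib rather than the Airflow API, and the task names extract, transform, and load are hypothetical. It only illustrates how a dependency graph yields a valid run order, which is the job a scheduler performs.

```python
from graphlib import TopologicalSorter

# Hypothetical three-task pipeline. Each key maps a task to the set of
# tasks that must finish before it can start.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def run_order(deps):
    """Return one valid execution order for the task graph.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

A real Airflow DAG file declares the same kind of structure with operators and dependency arrows; the standard-library version is used here only so the example runs without an Airflow installation.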

Simplified Architecture

GKE Cluster

Schedulers x n

DAGs

updates metadata

triggers

Workers x n

views

runs on

syncs to

CI/CD
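The "syncs to" step can be implemented in several ways. One option, assumed here rather than taken from the slides, is for the CI/CD job to run `gcloud composer environments storage dags import` after each merge. The helper below only builds the argument list; the environment name, location, and the local file path `dags/sample_quickstart.py` are placeholders.

```python
import shlex


def dags_import_command(environment, location, source):
    """Build the gcloud command that uploads a local DAG file to an environment."""
    return [
        "gcloud", "composer", "environments", "storage", "dags", "import",
        f"--environment={environment}",
        f"--location={location}",
        f"--source={source}",
    ]


# Placeholder values; replace with the real environment and DAG path.
import_cmd = dags_import_command(
    "example-environment", "us-central1", "dags/sample_quickstart.py"
)
print(shlex.join(import_cmd))
```

A pipeline step could pass `import_cmd` to `subprocess.run(import_cmd, check=True)` so that a failed upload fails the build.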

Webserver and security

HTTPS

DAGs GUI

Command Line Tools

gcloud composer environments run example-environment \
    --location us-central1 dags trigger -- sample_quickstart \
    --run-id=5077
gcloud composer environments run example-environment \
    --location us-central1 dags list -- --output=json
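Both commands follow the same pattern: the environment and location, then the Airflow subcommand, then `--`, then the arguments passed through to Airflow. A minimal sketch of a helper that assembles such commands is shown below; the function name `airflow_cli_command` is hypothetical, and the environment, location, DAG ID, and run ID are the sample values used above.

```python
import shlex


def airflow_cli_command(environment, location, subcommand, args=()):
    """Build a `gcloud composer environments run` command as an argument list.

    subcommand: Airflow CLI subcommand words, e.g. ["dags", "trigger"].
    args: arguments forwarded to Airflow, placed after the `--` separator.
    """
    cmd = [
        "gcloud", "composer", "environments", "run", environment,
        "--location", location, *subcommand,
    ]
    if args:
        cmd += ["--", *args]
    return cmd


trigger = airflow_cli_command(
    "example-environment", "us-central1",
    ["dags", "trigger"], ["sample_quickstart", "--run-id=5077"],
)
list_dags = airflow_cli_command(
    "example-environment", "us-central1",
    ["dags", "list"], ["--output=json"],
)
print(shlex.join(trigger))
print(shlex.join(list_dags))
```

Keeping the command as a list avoids shell-quoting problems when it is later handed to `subprocess.run`.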

Kubernetes Executor

Git repository

  • Java
  • Go
  • Custom Python packages
  • CPU x n
  • RAM x n
  • Autoscaling enabled

Schedulers x n

DAGs

triggers

triggers

VPC Native and Private IP Cluster

Cloud Composer

on-prem network

primary (nodes): 10.1.0.0/16

secondary (pods): 10.2.0.0/20

VM (Bastion Host)

Cloud NAT

Cloud Composer

primary (nodes): 10.1.0.0/16

secondary (pods): 10.2.0.0/20

SaaS on the internet
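When planning a VPC-native cluster it helps to check how many addresses each range provides and that the ranges do not collide. The sketch below uses Python's standard-library ipaddress module with the two ranges from the diagrams; the helper name `overlapping_pairs` is hypothetical.

```python
import ipaddress
from itertools import combinations

# Ranges taken from the diagrams above.
ranges = {
    "primary (nodes)": ipaddress.ip_network("10.1.0.0/16"),
    "secondary (pods)": ipaddress.ip_network("10.2.0.0/20"),
}


def overlapping_pairs(named_ranges):
    """Return the names of every pair of ranges that share addresses."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(named_ranges.items(), 2)
        if net_a.overlaps(net_b)
    ]


for name, net in ranges.items():
    print(name, net, net.num_addresses)

conflicts = overlapping_pairs(ranges)
print("overlaps:", conflicts)
```

The same check can be extended with the on-prem network's ranges before connecting the two networks, since overlapping ranges cannot be routed between them.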

Composer 1 vs Composer 2

Composer 1

  • Manage your own node pool
  • Node size fixed at cluster creation time
  • Supports Airflow 1 and 2
  • Pay for node resources

Composer 2

  • GCP manages nodes and node pool
  • Node size changeable after cluster creation
  • Supports Airflow 2 only
  • Pay for Pod resources
  • Workload Identity enabled by default
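The difference between paying for node resources and paying for Pod resources can be illustrated with a rough sketch. Every number below is a made-up illustration value, not a Google Cloud price, quota, or default, and the function names are hypothetical. The point is only that node-based billing counts the full capacity of every node, while Pod-based billing counts what the Pods request.

```python
def node_based_vcpu(node_count, vcpu_per_node):
    """vCPUs counted when whole nodes are billed, regardless of utilisation."""
    return node_count * vcpu_per_node


def pod_based_vcpu(pod_requests):
    """vCPUs counted when only the Pods' resource requests are billed."""
    return sum(pod_requests)


# Hypothetical environment: 3 nodes with 4 vCPUs each, running Pods that
# together request 4 vCPUs.
billed_by_node = node_based_vcpu(3, 4)
billed_by_pod = pod_based_vcpu([0.5, 0.5, 1.0, 2.0])

print(billed_by_node, billed_by_pod)
```

Under these assumed figures the node-based model counts idle capacity that the Pod-based model does not; actual costs depend on the published pricing for each Composer version.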

Summary

  • Use Airflow 2
  • Use Composer 2
  • Always use VPC Native Private Cluster
  • Always use Cloud NAT

intro-cloud-composer

By Richard He
