Continuous auditing of Amazon Web Services with serverless Amazon Lambda Python functions leveraged by using the Cloud Custodian framework to encourage learning opportunities for engineers

#CAOAWSWSALPFCCFLOFE

Auditing AWS Serverlessly

using Cloud Custodian

Who is This Guy?

Thomas Krag

👨🏻‍🔬 Co-founder of Method (we're not hiring)

🤦🏻‍♂️ DevOps Engineer

🇩🇰 I am Danish

🍺 I enjoy refreshments

💻 I work with computers

🎤 I organize conferences

🎸 I used to work with music

🐍 I write some Python

☕️ I write some Node.js

🥃 I enjoy refreshments

Who is Method?

What is Cloud Custodian?

Cloud Custodian

  • rules engine
  • YAML DSL
  • real-time compliance
  • runs in AWS Lambda
  • OSS by Capital One

github.com/capitalone/cloud-custodian

Why Would I Care?

Examples

Triggers

  • CloudWatch
  • EC2 Instance State
  • Periodic (cron)

policies:
  - name: s3-bucket-check
    resource: s3
    mode:
      type: periodic
      schedule: "rate(1 day)"

Scheduled/cron

policies:
  - name: s3-bucket-check
    resource: s3
    mode:
      type: periodic
      schedule: "cron(0 12 * * ? *)"
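The `schedule` value ends up as a CloudWatch Events schedule expression, which is either `rate(...)` or a six-field `cron(...)` (minutes, hours, day-of-month, month, day-of-week, year) with a `?` in one of the two day fields. A minimal sketch of a pre-flight check, assuming you want to catch malformed expressions before deploying. The helper `is_valid_schedule` is hypothetical, not part of Cloud Custodian, and it only checks the shape of the expression, not the range of each field.

```python
import re

# rate(value unit): the unit is singular when value is 1, plural otherwise.
_RATE = re.compile(r"^rate\((\d+) (minute|minutes|hour|hours|day|days)\)$")
# cron(...) wraps six space-separated fields.
_CRON = re.compile(r"^cron\((.+)\)$")


def is_valid_schedule(expr: str) -> bool:
    """Shape-only check; hypothetical helper, not a Cloud Custodian API."""
    rate = _RATE.match(expr)
    if rate:
        value, unit = int(rate.group(1)), rate.group(2)
        if value < 1:
            return False
        # "rate(1 day)" is fine, "rate(1 days)" is not.
        return unit.endswith("s") == (value != 1)
    cron = _CRON.match(expr)
    if cron:
        fields = cron.group(1).split()
        if len(fields) != 6:
            return False
        day_of_month, day_of_week = fields[2], fields[4]
        # At least one of the two day fields has to be "?".
        return "?" in (day_of_month, day_of_week)
    return False
```

Unix-style five-field expressions such as `cron(*/5 * * * *)` fail this check, which is the mistake it is meant to catch.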

Let's solve some problems

I keep hitting AWS resource limits

policies:
  - name: account-service-limits
    resource: account
    region: eu-west-1
    filters:
      - type: service-limit
        services:
          - EC2
        threshold: 50
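The `threshold` reads as a percentage of the limit in use, so `50` flags any EC2 limit that is at least half consumed. A rough sketch of that comparison, assuming usage and limit figures shaped like the made-up dicts below. The real numbers come from AWS, and `limits_over_threshold` is a hypothetical helper, not a Custodian function.

```python
def limits_over_threshold(limits: list, threshold: float = 50.0) -> list:
    """Return the names of limits whose usage is at or above threshold percent."""
    flagged = []
    for item in limits:
        if item["limit"] <= 0:
            continue  # skip unset limits rather than divide by zero
        percent = item["used"] / item["limit"] * 100
        if percent >= threshold:
            flagged.append(item["name"])
    return flagged


# Made-up figures for illustration only.
ec2_limits = [
    {"name": "On-Demand instances", "used": 12, "limit": 20},
    {"name": "Elastic IPs", "used": 1, "limit": 5},
]
```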

I want to keep my RDS instances secure

policies:
  - name: rds-unencrypted-public-removal
    resource: rds
    mode:
      type: cloudtrail
      role: arn:aws:iam::{account_id}:role/FullLambda
      events:
        - CreateDBInstance
    filters:
      - or:
          - StorageEncrypted: false
          - PubliclyAccessible: true
    actions:
      - type: delete
        skip-snapshot: true

Left-over EBS volumes are making my AWS bill 'splode 💥

policies:
- name: ebs-mark-unattached-deletion
  resource: ebs
  filters:
    - Attachments: []
    - "tag:maid_status": absent
  actions:
    - type: mark-for-op
      op: delete
      days: 30

- name: ebs-unmark-attached-deletion
  resource: ebs
  filters:
    - type: value
      key: "Attachments[0].Device"
      value: not-null
    - "tag:maid_status": not-null
  actions:
    - unmark

- name: ebs-delete-marked
  resource: ebs
  filters:
    - type: marked-for-op
      op: delete
  actions:
    - delete
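The three policies talk to each other through a tag: `mark-for-op` writes the pending operation and its due date into `maid_status`, `unmark` removes it, and `marked-for-op` matches once the date has arrived. A simplified sketch of that handshake, assuming the default tag name and a message format of `...: <op>@<YYYY/MM/DD>`. Both are my reading of Custodian's defaults, and the functions are hypothetical stand-ins, not its API.

```python
from datetime import date, datetime, timedelta

TAG = "maid_status"  # assumed default tag for mark-for-op
MSG = "Resource does not meet policy: {op}@{when}"  # assumed message format


def mark(tags: dict, op: str, days: int, today: date) -> None:
    """Record the pending operation and its due date in the tag."""
    due = today + timedelta(days=days)
    tags[TAG] = MSG.format(op=op, when=due.strftime("%Y/%m/%d"))


def is_due(tags: dict, op: str, today: date) -> bool:
    """Roughly what a marked-for-op filter checks: same op, date reached."""
    value = tags.get(TAG, "")
    if ":" not in value or "@" not in value:
        return False
    _, target = value.rsplit(":", 1)
    marked_op, when = target.strip().split("@", 1)
    due = datetime.strptime(when, "%Y/%m/%d").date()
    return marked_op == op and today >= due


volume_tags = {}
mark(volume_tags, "delete", 30, date(2017, 6, 1))
```

Reattaching the volume and running the unmark policy drops the tag, so `is_due` goes back to returning `False` and nothing is deleted.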

We have 1k+ EC2 instances but none of them are properly tagged

policies:
- name: ec2-tag-compliance-mark
  resource: ec2
  filters:
    - "tag:aws:autoscaling:groupName": absent
    - "tag:maid_status": absent
    - "tag:Owner": absent
    - "tag:CostCenter": absent
    - "tag:Project": absent
  actions:
    - type: mark-for-op
      op: stop
      days: 1

- name: ec2-tag-compliance-unmark
  resource: ec2
  filters:
    - "tag:Owner": not-null
    - "tag:CostCenter": not-null
    - "tag:Project": not-null
    - "tag:maid_status": not-null
  actions:
    - unmark
    - start

- name: ec2-tag-compliance-stop
  resource: ec2
  filters:
    - "tag:aws:autoscaling:groupName": absent
    - "tag:Owner": absent
    - "tag:CostCenter": absent
    - "tag:Project": absent
    - type: marked-for-op
      op: stop
  actions:
    - stop
    - type: mark-for-op
      op: terminate
      days: 3

- name: ec2-tag-compliance-terminate
  resource: ec2
  filters:
    - "tag:aws:autoscaling:groupName": absent
    - "tag:Owner": absent
    - "tag:CostCenter": absent
    - "tag:Project": absent
    - type: marked-for-op
      op: terminate
  actions:
    - type: terminate
      force: true
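Worth knowing: filters listed side by side are combined with AND, so these policies only catch instances missing all three of `Owner`, `CostCenter` and `Project`. A small sketch of the difference, assuming you would rather flag an instance missing any one of them (in policy terms, wrapping the three tag checks in an `or:` block). The helper names are hypothetical.

```python
REQUIRED = ("Owner", "CostCenter", "Project")


def missing_all(tags: dict) -> bool:
    """AND of 'absent' checks: what a plain filter list expresses."""
    return all(key not in tags for key in REQUIRED)


def missing_any(tags: dict) -> bool:
    """OR of 'absent' checks: what an `or:` block would express."""
    return any(key not in tags for key in REQUIRED)


# An instance with an owner but no cost center or project.
partly_tagged = {"Owner": "thomas"}
```

With the AND form `partly_tagged` slips through untouched. With the OR form it gets marked.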

We allow 'the internet' in our Security Groups

policies:
  - name: are-you-nuts
    resource: security-group
    mode:
      type: cloudtrail
      events:
        - source: ec2.amazonaws.com
          event: AuthorizeSecurityGroupIngress
          ids: "requestParameters.groupId"
        - source: ec2.amazonaws.com
          event: AuthorizeSecurityGroupEgress
          ids: "requestParameters.groupId"
        - source: ec2.amazonaws.com
          event: RevokeSecurityGroupEgress
          ids: "requestParameters.groupId"
        - source: ec2.amazonaws.com
          event: RevokeSecurityGroupIngress
          ids: "requestParameters.groupId"
    filters:
      - type: ingress
        Cidr:
          value: "0.0.0.0/0"
    actions:
      - type: remove-permissions
        ingress: matched
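Under the hood the `ingress` filter is looking through the group's `IpPermissions` for a rule whose CIDR is `0.0.0.0/0`, and `ingress: matched` removes only those rules. A sketch of the matching half, assuming a security group dict shaped like the EC2 `DescribeSecurityGroups` output. The group below is made up, and `open_ingress_rules` is a hypothetical helper, not Custodian code.

```python
import ipaddress

OPEN = ipaddress.ip_network("0.0.0.0/0")


def open_ingress_rules(group: dict) -> list:
    """Return the IpPermissions entries that allow 0.0.0.0/0."""
    matched = []
    for perm in group.get("IpPermissions", []):
        cidrs = [r["CidrIp"] for r in perm.get("IpRanges", [])]
        if any(ipaddress.ip_network(c) == OPEN for c in cidrs):
            matched.append(perm)
    return matched


group = {
    "GroupId": "sg-0123456789abcdef0",  # made-up ID for illustration
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "10.0.0.0/8"}]},
    ],
}
```

Here only the SSH rule on port 22 is returned. The internal HTTPS rule is left alone.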

But wait! There's more...

Manager Approval 101

policies:
  - name: stop-after-hours
    resource: ec2
    filters:
      - type: offhour
        tag: StopAfterHours
        default_tz: utc
        offhour: 20
      - type: instance-age
        hours: 1
    actions:
      - stop


  - name: start-after-hours
    resource: ec2
    filters:
      - type: onhour
        tag: StartAfterHours
        default_tz: utc
        onhour: 8
      - type: value
        value: 1
        key: LaunchTime
        op: less-than
        value_type: age
    actions:
      - start
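Both filters boil down to an hour-of-day check in the configured timezone, which is why policies like these are normally run every hour. A simplified sketch, assuming UTC as in the policies above, assuming weekends are skipped by default, and ignoring any per-instance schedule stored in the tag value. `matches_hour` is a hypothetical helper, not Custodian's implementation.

```python
from datetime import datetime, timezone


def matches_hour(now: datetime, hour: int, weekdays_only: bool = True) -> bool:
    """True when `now` falls in the given hour (and on a weekday, if required)."""
    if weekdays_only and now.weekday() >= 5:  # 5 and 6 are Saturday and Sunday
        return False
    return now.hour == hour


monday_evening = datetime(2017, 6, 5, 20, 15, tzinfo=timezone.utc)
saturday_evening = datetime(2017, 6, 3, 20, 15, tzinfo=timezone.utc)
```

With `offhour: 20`, `monday_evening` matches and the instance is stopped; `saturday_evening` does not match under the weekday assumption.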

💰

💰

We're doing it live!

💻

Summary

Questions?

@vikingopsio / @withmethod

Tak! 🍻

slides.com/vikingops/custodian/

Auditing AWS Serverlessly

By Thomas Krag

A basic introduction to using Cloud Custodian.
