Using observability to scale AWS Lambda

bene@theodo.co.uk

Ben Ellerby

@EllerbyBen

http://serverless-transformation.com/

https://www.theodo.co.uk/experts/serverless

Alex White

@agwhi_

Serverless

What is this Serverless thing?

Architectural movement
- “allows you to build and run applications and services without thinking about servers” (AWS)
- Developers send application code, which the cloud provider runs in isolated containers abstracted away from the developer.
- Third-party services manage backend logic and state (e.g. Firebase, Cognito)
A framework with the same name

Why Serverless?

💰 Cost reduction

👷‍♂️ #NoOps... well LessOps

💻 Developers focus on delivering business value

📈 More scalable

🌳 Greener

Not just Lambda (FaaS)

Lambda: Compute
Storage
DynamoDB: Data
API Gateway: API Proxy
Cognito: Auth
SQS: Queue
Step Functions: Workflows
EventBridge: Bus

Power and Flexibility to build...

Optimising Lambda during Development

Nathan Malishev

https://levelup.gitconnected.com/aws-lambda-cold-start-language-comparisons-2019-edition-%EF%B8%8F-1946d32a0244

Improving performance

Reduce cold starts
Power tuning
Architecture/code

Cold Starts

Code hasn't been executed in a while
Scaling up
Rebalancing across availability zones
Updating code or config flushes warm execution environments

Improving Cold Starts

Frequency

Duration

Provisioned Concurrency

Duration

Measuring with X-Ray

Cold Start times

Package size
Runtime
Amount of code
Amount of initialisation work

Reducing Duration

Avoid monolithic functions
Minify code
- Webpack
Optimise imports
- Only import the parts of the library you're using
- Lazy-load dependencies that might not be used
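
The lazy-loading point above can be sketched roughly as follows. This is a minimal illustration, not the speakers' code: the `lazy` helper, the `getReportRenderer` stand-in and the `handler` event shape are all hypothetical. In a real function the initialiser would typically wrap a `require` or dynamic `import()` of the heavy dependency.

```typescript
// Hypothetical helper: defer expensive initialisation until first use, then cache it.
function lazy<T>(init: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) {
      cached = { value: init() };
    }
    return cached.value;
  };
}

// Counts how often the (stand-in) expensive dependency is initialised.
let initCount = 0;

// Stand-in for a heavy dependency that only one code path needs.
const getReportRenderer = lazy(() => {
  initCount += 1;
  return { render: (title: string) => `report:${title}` };
});

// Hypothetical handler: only the "report" path pays the initialisation cost.
function handler(event: { action: "ping" | "report"; title?: string }): string {
  if (event.action === "ping") {
    return "pong"; // the renderer is never initialised on this path
  }
  return getReportRenderer().render(event.title ?? "untitled");
}
```

Invocations that never reach the report path skip the initialisation entirely, and warm invocations reuse the cached instance.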

HTTP Keep-Alive

Reuse TCP connections between requests
Can reduce a DynamoDB operation from 30ms to 10ms
Easy to set up

Power Tuning

Memory = Power
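
A rough sketch of why memory and cost have to be tuned together: Lambda bills by GB-seconds, and CPU allocation grows with the memory setting. The price constant and all measurements below are assumptions for illustration only; real numbers should come from your own measurements and current AWS pricing.

```typescript
// Assumed price per GB-second; check current AWS pricing for your region and architecture.
const PRICE_PER_GB_SECOND = 0.0000166667;

// Cost of one invocation: memory (GB) x billed duration (s) x price.
function invocationCost(memoryMb: number, durationMs: number): number {
  return (memoryMb / 1024) * (durationMs / 1000) * PRICE_PER_GB_SECOND;
}

// Hypothetical measurements for a CPU-bound function: duration falls as memory rises.
const measurements = [
  { memoryMb: 128, durationMs: 4000 },
  { memoryMb: 512, durationMs: 1000 },
  { memoryMb: 1024, durationMs: 500 },
  { memoryMb: 2048, durationMs: 400 },
];

const results = measurements.map((m) => ({
  ...m,
  costPerMillion: invocationCost(m.memoryMb, m.durationMs) * 1e6,
}));

// Pick the cheapest configuration, breaking cost ties in favour of the faster one.
const best = results.reduce((a, b) => {
  const diff = b.costPerMillion - a.costPerMillion;
  if (diff < -1e-9) return b;
  if (Math.abs(diff) <= 1e-9 && b.durationMs < a.durationMs) return b;
  return a;
});
```

With these made-up numbers the 128 MB, 512 MB and 1024 MB settings cost the same per invocation, so 1024 MB wins on speed, while 2048 MB is faster still but costs more.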

Power Tuning

https://github.com/alexcasalboni/aws-lambda-power-tuning

Power Tuning

Data-driven cost and performance optimisation
Available as an AWS Serverless Application Repository app
Can integrate with CI/CD

Input

Output

CPU-bound example

Architecture/Code

Distributed Tracing

Common mistakes

Fetching more data than you need
Not using related services well
- Scans in DynamoDB
Defaulting to synchronous execution

Moving to Async

Sync

You pay while your lambda waits
Downstream slowdown affects the lambda
Needs custom code for error handling and retries

Async

Minimizes cost of waiting
Queueing separates fast and slow processes
Managed services provide reliability features

Do you need a lambda?

Move orchestration out of your lambdas
- avoid paying for idle time
- Use Step Functions
Move data transport out of functions
- "Transform not transport"
- Use VTL when possible
  - Access DynamoDB directly

Parallelise Code

Make use of promises (in Node.js) to parallelise processing
AVX2 support announced at re:Invent 2020
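
As a minimal sketch of the promises point, the snippet below contrasts awaiting calls one by one with starting them together via `Promise.all`. `fakeFetch` is a hypothetical stand-in for an I/O call such as a database read. The delays are simulated.

```typescript
// Stand-in for an I/O call (e.g. a database read); resolves after delayMs.
function fakeFetch(id: string, delayMs: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve(`item:${id}`), delayMs));
}

// Sequential: each call waits for the previous one, so total time is roughly the sum of delays.
async function fetchSequential(ids: string[], delayMs: number): Promise<string[]> {
  const out: string[] = [];
  for (const id of ids) {
    out.push(await fakeFetch(id, delayMs));
  }
  return out;
}

// Concurrent: all calls start at once, so total time is roughly the slowest single call.
// Promise.all keeps results in the same order as the input.
function fetchParallel(ids: string[], delayMs: number): Promise<string[]> {
  return Promise.all(ids.map((id) => fakeFetch(id, delayMs)));
}
```

Promises overlap time spent waiting on I/O rather than CPU work, so the gain shows up when a function makes several independent downstream calls.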

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

- Donald Knuth

Conclusion

Cold Starts

AWS optimises the first part for us
What we can change
Provisioned Concurrency

Power tuning

Memory = Power
AWS Lambda Power Tuning
Cost vs Speed

Architecture/Code

Move to async
You don't always need a lambda
Use parallelisation

Load Testing

Microservices

How does Serverless Scale Differently?

Pay-per-use => leading to denial of wallet
AWS Service Limits (both hard and soft)
Outpace non-serverless components / third parties
Cold start impacts for sudden spikes
Combination of multiple services in a distributed system makes bottlenecks harder to spot
Regional distribution of traffic
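
To make the service-limit point concrete, here is a back-of-the-envelope sketch using Little's law (average concurrency = arrival rate x average duration). The quota value and the traffic figures are assumptions; the real quota for an account is shown in Service Quotas.

```typescript
// Little's law: average concurrency = arrival rate (req/s) x average duration (s).
function requiredConcurrency(requestsPerSecond: number, avgDurationMs: number): number {
  return Math.ceil((requestsPerSecond * avgDurationMs) / 1000);
}

// Assumed regional concurrent-execution quota; check Service Quotas for the real value.
const ASSUMED_CONCURRENCY_LIMIT = 1000;

function checkLimit(requestsPerSecond: number, avgDurationMs: number) {
  const needed = requiredConcurrency(requestsPerSecond, avgDurationMs);
  return { needed, throttlingLikely: needed > ASSUMED_CONCURRENCY_LIMIT };
}

// Hypothetical traffic levels: a normal day and a tournament spike.
const normalDay = checkLimit(500, 200); // 100 concurrent executions
const spike = checkLimit(5000, 300); // 1500 concurrent executions
```

Under these assumptions the spike needs more concurrency than the quota allows, which is the kind of limit a load test should surface before production does.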

What is Load Testing?

Different things to different people.

Simulating different concurrent traffic levels on an application to validate its scalability.

Protocol vs Browser

2 Types:
- Protocol Based: Simulating at the API level
- Browser Based: Spinning up browsers and simulating interactions with browser elements to trigger realistic protocol-level requests.
Typically, Serverless architectures are best tested at the protocol level, as the scale of testing is usually high and browser-simulated testing at this level would be expensive and slow.
Protocol-level tests could be HTTP API requests, or more custom triggering via the AWS SDK directly.

Components of a good load test

Exact replica of production infrastructure
Observability tooling
Repeatable scenarios
Ability to simulate high load
Realistic user flows
Geographic distribution

Example Application: Gamercraft

The Gamercraft platform needed to support a massive volume of users and to accommodate both traffic spikes during large-scale tournaments and low-usage periods.

Gamercraft

Gamercraft

What do we want to test?

Validate our cost estimates as load increases
Identify AWS Service Limits that need raising
Identify AWS Service Limits that can't be raised
Verify 3rd party and non-serverless components are protected from spikes
Identify hidden bottlenecks
Verify impact of regional distribution

How do I start?

🤷‍♂️

Environment to Test Against

🌎

Isomorphic Ephemeral Load Testing Environments

100% serverless architectures can be deployed to short-lived environments.
In "Serverless Flow" we spin up an environment per PR to run integration testing.
There is zero mocking, and the architecture is isomorphic to production.
This same approach can be taken for isolated load testing.

* Non-serverless components and 3rd parties add complexity

Metrics

📊

Basic Metrics

Response times
Error rates
Throttles
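
A small sketch of how these three metrics could be derived from raw load-test samples. The `RequestSample` shape is hypothetical, and treating HTTP 429 as a throttle and 5xx as an error is an assumption about how the API under test reports them.

```typescript
// Hypothetical shape of one recorded request from a load test.
interface RequestSample {
  latencyMs: number;
  status: number;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

function summarise(samples: RequestSample[]) {
  const latencies = samples.map((s) => s.latencyMs);
  const errors = samples.filter((s) => s.status >= 500).length;
  // Assumption: throttled requests surface as HTTP 429.
  const throttles = samples.filter((s) => s.status === 429).length;
  return {
    p50Ms: percentile(latencies, 50),
    p95Ms: percentile(latencies, 95),
    errorRate: errors / samples.length,
    throttleRate: throttles / samples.length,
  };
}

// Made-up samples: eight fast successes, one slow 500 and one 429.
const samples: RequestSample[] = [
  { latencyMs: 100, status: 200 }, { latencyMs: 120, status: 200 },
  { latencyMs: 110, status: 200 }, { latencyMs: 130, status: 200 },
  { latencyMs: 900, status: 500 }, { latencyMs: 105, status: 200 },
  { latencyMs: 115, status: 200 }, { latencyMs: 125, status: 200 },
  { latencyMs: 95, status: 429 }, { latencyMs: 140, status: 200 },
];
const summary = summarise(samples);
```

Percentiles are used rather than the mean because a single slow request, like the 900 ms one here, barely moves the median but dominates the tail.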

Know what’s happening

The flexibility, distribution and granularity of Serverless architectures make logging hard.
CloudWatch and X-Ray are the minimum.

CloudWatch Lambda Insights

Dedicated Observability Service

Load Testing Toolkit

🛠

Artillery

Artillery is a load testing and smoke testing solution for SREs, developers and QA engineers.

Artillery - Test Definition

config:
  target: "https://shopping.service.staging"
  phases:
    - duration: 60
      arrivalRate: 5
      name: Warm up
    - duration: 120
      arrivalRate: 5
      rampTo: 50
      name: Ramp up load
    - duration: 600
      arrivalRate: 50
      name: Sustained load
  payload:
    # Load search keywords from an external CSV file and make them available
    # to virtual user scenarios as variable "keywords":
    path: "keywords.csv"
    fields:
      - "keywords"
scenarios:
  # We define one scenario:
  - name: "Search and buy"
    flow:
      - post:
          url: "/search"
          body: "kw={{ keywords }}"
          # The endpoint responds with JSON, which we parse and extract a field from
          # to use in the next request:
          capture:
            json: "$.results[0].id"
            as: "id"
      # Get the details of the product:
      - get:
          url: "/product/{{ id }}/details"
      # Pause for 3 seconds:
      - think: 3
      # Add product to cart:
      - post:
          url: "/cart"
          json:
            productId: "{{ id }}"

artillery run search-and-add-to-cart.yml
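
To get a feel for the load this definition generates, the phases can be added up by hand. The sketch below mirrors the three phases from the config above, treating `arrivalRate` as new virtual users per second and approximating the ramp as linear. The `Phase` type and helper are illustrative only and are not part of Artillery.

```typescript
// Illustrative mirror of the phases in the test definition above.
interface Phase {
  name: string;
  duration: number; // seconds
  arrivalRate: number; // new virtual users per second at the start of the phase
  rampTo?: number; // arrival rate at the end of the phase, if ramping
}

const phases: Phase[] = [
  { name: "Warm up", duration: 60, arrivalRate: 5 },
  { name: "Ramp up load", duration: 120, arrivalRate: 5, rampTo: 50 },
  { name: "Sustained load", duration: 600, arrivalRate: 50 },
];

// Approximate arrivals in a phase: average rate x duration (linear ramp assumed).
function expectedArrivals(phase: Phase): number {
  const endRate = phase.rampTo ?? phase.arrivalRate;
  return ((phase.arrivalRate + endRate) / 2) * phase.duration;
}

const totalSeconds = phases.reduce((sum, p) => sum + p.duration, 0);
const totalUsers = phases.reduce((sum, p) => sum + expectedArrivals(p), 0);

// Each virtual user in the "Search and buy" flow makes three HTTP requests.
const REQUESTS_PER_USER = 3;
const totalRequests = totalUsers * REQUESTS_PER_USER;
```

That works out to roughly 33,600 virtual users and about 100,800 requests over 13 minutes, which is useful to know before estimating the cost of a run.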

But where would we run this from?

A server... 🤮

What if there was another way?

A service that can run code (without us having to manage servers) with support for massive parallel scale?

Serverless-Artillery (slsart)

Combine serverless with artillery and you get serverless-artillery for instant, cheap, and easy performance testing at scale.

Serverless-Artillery (slsart)

Running From a Different AWS Account

We are running our load test using AWS services (i.e. Lambda).
We don't want the load-testing infra to impact limits on our infra under test.
More realistic traffic paths

Committing Experiments

All tests should be repeatable experiments.
The context for the test, scenario templates and results should all be committed to the repo.
Allows future analysis and repeating of experiments.

Components of a good load test

Exact replica of production infrastructure
Observability tooling
Repeatable scenarios
Ability to simulate high load
Realistic user flows
Geographic distribution
Committed repeatable tests

Conclusion

🌎

📊

🛠

https://slides.com/alexwhite

serverless-transformation

Serverless Optimisation Workshop

Alex White Joint Presentation: Using observability to scale AWS Lambda

By Ben Ellerby

Serverless architectures on AWS, involving services like AWS Lambda, DynamoDB, Cognito, Step Functions, API Gateway, bring instant scalability when built and configured in the correct way. We’ll look at how AWS Serverless architectures need to be treated differently to ensure optimal scalability and how Serverless tools (like Serverless Artillery) can be used to verify scalability. Not only will we look at achieving scalability, we’ll also look at the tools and techniques to predict and limit the cost of scaling. To bring these topics to life we’ll look at the architecture of 2 live Serverless applications built on AWS for scale and discuss how they were architected, how costs were monitored and kept in line and how serverless load testing was used to verify scalability and catch edge cases.
