Using observability to scale AWS Lambda
bene@theodo.co.uk
Ben Ellerby
@EllerbyBen
http://serverless-transformation.com/
https://www.theodo.co.uk/experts/serverless
Alex White
@agwhi_
Serverless
What is this Serverless thing?
- Architectural movement
- “allows you to build and run applications and services without thinking about servers” — AWS
- Developers send application code which is run by the cloud provider in isolated containers abstracted from the developer.
- Use of third-party services to manage backend logic and state (e.g. Firebase, Cognito)
- A framework with the same name
Why Serverless?
💰 Cost reduction
👷‍♂️ #NoOps... well LessOps
💻 Developers focus on delivering business value
📈 More scalable
🌳 Greener
Not just Lambda (FaaS)
- Lambda: Compute
- S3: Storage
- Dynamo: Data
- API Gateway: API Proxy
- Cognito: Auth
- SQS: Queue
- Step Functions: Workflows
- EventBridge: Bus
Power and Flexibility to build...
Optimising Lambda during Development
https://levelup.gitconnected.com/aws-lambda-cold-start-language-comparisons-2019-edition-%EF%B8%8F-1946d32a0244
Improving performance
- Reduce cold starts
- Power tuning
- Architecture/code
Cold Starts
- Code hasn't been executed in a while
- Scaling up
- Rebalancing across availability zones
- Updating code or config flushes the existing warm containers
Improving Cold Starts
Frequency
Duration
Provisioned Concurrency
Duration
Measuring with X-Ray
Cold Start Times
- Package size
- Runtime
- Amount of code
- Amount of initialisation work
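The "amount of initialisation work" point can be illustrated with a minimal sketch, assuming a Node.js runtime: module-scope code runs once per execution environment (during the cold start), while the handler body runs on every invocation. `loadConfig` and the table name are hypothetical stand-ins for real initialisation work such as creating SDK clients.

```typescript
// Module scope: runs once per execution environment, during the cold start.
let isColdStart = true;

// Hypothetical stand-in for expensive initialisation (SDK clients, config, ...).
function loadConfig(): { tableName: string } {
  return { tableName: "example-table" };
}
const config = loadConfig();

// Handler body: runs on every invocation and reuses `config`.
// In a real deployment this function would be exported as the Lambda handler.
async function handler(): Promise<{ coldStart: boolean; tableName: string }> {
  const coldStart = isColdStart;
  isColdStart = false;
  return { coldStart, tableName: config.tableName };
}
```

Returning or logging the `coldStart` flag is one simple way to see how often cold starts happen alongside the X-Ray traces.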
Reducing Duration
- Avoid monolithic functions
- Minify code
- Webpack
- Optimise imports
- Only import the parts of the library you're using
- Lazy-load dependencies that might not be used
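Lazy loading can be sketched as a small memoising wrapper. This is an illustration under assumptions, not a prescribed pattern: the "PDF renderer" is a hypothetical heavy dependency, and in a real function the initialiser might be a `require(...)` or dynamic `import(...)` of a large library.

```typescript
// Wrap an expensive initialiser so it runs at most once, on first use.
function lazy<T>(init: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) {
      cached = { value: init() };
    }
    return cached.value;
  };
}

// Hypothetical heavy dependency; `initCount` only exists to show when it loads.
let initCount = 0;
const getPdfRenderer = lazy(() => {
  initCount += 1;
  return { render: (text: string) => `PDF(${text})` };
});

// Only the code path that needs the dependency pays for loading it.
function handleRequest(kind: "json" | "pdf", text: string): string {
  if (kind === "pdf") {
    return getPdfRenderer().render(text);
  }
  return JSON.stringify({ text });
}
```

The trade-off is that the first request on the rarely used path pays the load cost instead of every cold start paying it.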
HTTP Keep-Alive
- Reuse TCP connections between requests
- Can reduce a DynamoDB operation from around 30 ms to 10 ms
- Easy to set up
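A minimal sketch of connection reuse using Node's built-in `https.Agent`, assuming a Node.js runtime. The socket limit and the `requestOptions` helper are illustrative choices, not values from the talk.

```typescript
import * as https from "node:https";

// Keep sockets open between requests so warm invocations skip the
// TCP and TLS handshakes. The socket limit is an arbitrary example value.
const agentOptions: https.AgentOptions = { keepAlive: true, maxSockets: 50 };

// Created at module scope so it is shared across warm invocations.
const keepAliveAgent = new https.Agent(agentOptions);

// Hypothetical helper: build request options that route through the shared agent.
function requestOptions(hostname: string, path: string): https.RequestOptions {
  return { hostname, path, method: "GET", agent: keepAliveAgent };
}
```

With version 2 of the AWS SDK for JavaScript, setting the environment variable `AWS_NODEJS_CONNECTION_REUSE_ENABLED=1` has the same effect without code changes; version 3 of the SDK reuses connections by default.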
Power Tuning
Memory = Power
Power Tuning
https://github.com/alexcasalboni/aws-lambda-power-tuning
Power Tuning
- Data-driven cost and performance optimisation
- Available as an AWS Serverless Application Repository app
- Can integrate with CI/CD
Input
Output
CPU-bound example
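A rough way to reason about a CPU-bound function is to compare cost per invocation across memory settings, since more memory also means more CPU. The sketch below is illustrative only: the price constant is an assumed on-demand rate per GB-second (check current pricing), the per-request charge is ignored, and the durations are made-up measurements, not results from the talk.

```typescript
// Assumed on-demand price per GB-second; check current pricing for your region.
const PRICE_PER_GB_SECOND = 0.0000166667;

// Compute cost for one invocation: memory (GB) x duration (s) x price.
function costPerInvocation(memoryMb: number, durationMs: number): number {
  return (memoryMb / 1024) * (durationMs / 1000) * PRICE_PER_GB_SECOND;
}

// Hypothetical measurements for a CPU-bound function at several memory sizes.
const measurements = [
  { memoryMb: 128, durationMs: 8000 },
  { memoryMb: 512, durationMs: 2000 },
  { memoryMb: 1024, durationMs: 900 },
  { memoryMb: 2048, durationMs: 850 },
];

// Pick the configuration with the lowest cost per invocation.
const cheapest = measurements.reduce((best, m) =>
  costPerInvocation(m.memoryMb, m.durationMs) <
  costPerInvocation(best.memoryMb, best.durationMs)
    ? m
    : best
);
```

In this made-up data the 1024 MB setting is both cheaper and much faster than 128 MB, while 2048 MB adds cost for little extra speed. That is the kind of trade-off the power tuning tool measures for real.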
Architecture/Code
Distributed Tracing
Common mistakes
- Fetching more data than you need
- Not using related services well
- Scans in DynamoDB
- Defaulting to synchronous execution
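The first and third mistakes can be contrasted in a small sketch. The interfaces mirror a subset of the DynamoDB Scan and Query request parameters (using plain, document-client-style values); the table name `orders` and the attributes `customerId`, `orderId` and `orderStatus` are hypothetical.

```typescript
// Subset of DynamoDB Scan request parameters.
interface ScanParams {
  TableName: string;
  FilterExpression?: string;
  ExpressionAttributeValues?: Record<string, unknown>;
}

// Subset of DynamoDB Query request parameters.
interface QueryParams {
  TableName: string;
  KeyConditionExpression: string;
  ExpressionAttributeValues: Record<string, unknown>;
  ProjectionExpression?: string;
}

// Anti-pattern: reads the whole table, then filters after the read.
function scanOrdersForCustomer(customerId: string): ScanParams {
  return {
    TableName: "orders",
    FilterExpression: "customerId = :c",
    ExpressionAttributeValues: { ":c": customerId },
  };
}

// Preferred: reads only the matching partition, and only the needed attributes.
function queryOrdersForCustomer(customerId: string): QueryParams {
  return {
    TableName: "orders",
    KeyConditionExpression: "customerId = :c",
    ExpressionAttributeValues: { ":c": customerId },
    ProjectionExpression: "orderId, orderStatus",
  };
}
```

This assumes `customerId` is the table's partition key; if it is not, a secondary index on that attribute would be needed before the query form applies.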
Moving to Async
Sync
- You pay while your lambda waits
- Downstream slowdown affects the lambda
- Needs custom code for error handling and retries
Async
- Minimizes cost of waiting
- Queueing separates fast and slow processes
- Managed services provide reliability features
Do you need a lambda?
- Move orchestration out of your lambdas
- Avoid paying for idle time
- Use step functions
- Move data transport out of functions
- "Transform not transport"
- Use VTL when possible
- Access DynamoDB directly
Parallelise Code
- Make use of promises (in Node.js) to parallelise processing
- AVX2 announced at re:Invent 2020
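For the promises point, a minimal sketch comparing sequential and parallel awaits. `fakeFetch` is a hypothetical stand-in for an I/O call such as a database read; note that promises overlap I/O waits and do not run CPU work in parallel.

```typescript
// Hypothetical I/O-bound task: resolves with `value` after `delayMs`.
function fakeFetch(value: string, delayMs: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve(value), delayMs));
}

// Sequential: each call waits for the previous one, so the total time
// is roughly the sum of the delays.
async function fetchSequentially(ids: string[]): Promise<string[]> {
  const results: string[] = [];
  for (const id of ids) {
    results.push(await fakeFetch(id, 20));
  }
  return results;
}

// Parallel: all calls start immediately, so the total time is roughly
// that of the slowest call. Result order still matches input order.
async function fetchInParallel(ids: string[]): Promise<string[]> {
  return Promise.all(ids.map((id) => fakeFetch(id, 20)));
}
```

Because Lambda bills for wall-clock duration, overlapping independent I/O calls this way reduces cost as well as latency.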
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."
- Donald Knuth
Conclusion
Cold Starts
- AWS optimises the first part for us
- What we can change
- Provisioned Concurrency
Power tuning
- Memory = Power
- AWS Lambda Power Tuning
- Cost vs Speed
Architecture/Code
- Move to async
- You don't always need a lambda
- Use parallelisation
Load Testing
Microservices
How does Serverless Scale Differently?
- Pay-per-use pricing can lead to "denial of wallet"
- AWS Service Limits (both hard and soft)
- Serverless components can outpace non-serverless components and third parties
- Cold start impacts for sudden spikes
- Combination of multiple services in a distributed system makes bottlenecks harder to spot
- Regional distribution of traffic
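The service-limit point can be made concrete with a back-of-the-envelope estimate: the number of concurrent executions is roughly the request rate multiplied by the average duration. This is a sketch under assumptions; the limit constant below is an assumed regional default, and the real value is per account and may differ.

```typescript
// Assumed regional concurrency limit; check the actual quota for your account.
const ASSUMED_CONCURRENCY_LIMIT = 1000;

// Estimated concurrent executions = requests per second x average duration (s).
function requiredConcurrency(requestsPerSecond: number, avgDurationMs: number): number {
  return (requestsPerSecond * avgDurationMs) / 1000;
}

// True when the estimated concurrency would exceed the assumed limit.
function exceedsLimit(requestsPerSecond: number, avgDurationMs: number): boolean {
  return requiredConcurrency(requestsPerSecond, avgDurationMs) > ASSUMED_CONCURRENCY_LIMIT;
}
```

For example, 50 requests per second at 200 ms needs about 10 concurrent executions, while 5000 requests per second at 300 ms needs about 1500 and would be throttled under the assumed limit.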
What is Load Testing?
- Different things to different people.
Simulating different concurrent traffic levels on an application to validate its scalability.
Protocol vs Browser
- Two types:
- Protocol Based: Simulating at the API level
- Browser Based: Spinning up browsers and simulating interactions with browser elements to trigger realistic protocol-level requests.
- Typically, Serverless architectures are best tested at the protocol level, as the scale of testing is usually high and browser-simulated testing at this level would be expensive and slow.
- Protocol could be HTTP API requests, or more custom triggering through the AWS SDK directly
Components of a good load test
- Exact replica of production infrastructure
- Observability tooling
- Repeatable scenarios
- Ability to simulate high load
- Realistic user flows
- Geographic distribution
Example Application: Gamercraft
The Gamercraft platform needed the ability to support a massive volume of users and to accommodate both traffic spikes during large-scale tournaments and low-usage periods.
Gamercraft
What do we want to test?
- Validate our cost estimates as load increases
- Identify AWS Service Limits that need raising
- Identify AWS Service Limits that can't be raised
- Verify 3rd party and non-serverless components are protected from spikes
- Identify hidden bottlenecks
- Verify impact of regional distribution
How do I start?
🤷‍♂️
Environment to Test Against
🌎
Isomorphic Ephemeral Load Testing Environments
- 100% serverless architectures can be deployed to short-lived environments.
- In "Serverless Flow" we spin up an environment per PR to run integration testing.
- There is no mocking, and the architecture is isomorphic to production.
- This same approach can be taken for isolated load testing.
* Non-serverless components and 3rd parties add complexity
Metrics
📊
Basic Metrics
- Response times
- Error rates
- Throttles
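These three metrics can be summarised from raw request samples with a few lines of code. The sketch below is illustrative: the `Sample` shape is hypothetical, HTTP 429 is assumed to indicate throttling, 5xx is assumed to indicate an error, and percentiles use the simple nearest-rank method.

```typescript
// Hypothetical shape of one recorded request.
interface Sample {
  status: number;
  durationMs: number;
}

// Nearest-rank percentile over a non-empty list of values.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("percentile needs at least one value");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarise response times, error rate and throttle rate.
function summarise(samples: Sample[]) {
  const durations = samples.map((s) => s.durationMs);
  const errors = samples.filter((s) => s.status >= 500).length;
  const throttles = samples.filter((s) => s.status === 429).length;
  return {
    p50: percentile(durations, 50),
    p95: percentile(durations, 95),
    errorRate: errors / samples.length,
    throttleRate: throttles / samples.length,
  };
}
```

Tracking a high percentile rather than the mean matters under load, because cold starts and throttling tend to show up in the tail first.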
Know what’s happening
- The flexibility, distribution and granularity of Serverless architectures make logging hard.
- CloudWatch & X-Ray are the minimum.
CloudWatch Lambda Insights
Dedicated Observability Service
Load Testing Toolkit
🛠
Artillery
Artillery is a load testing and smoke testing solution for SREs, developers and QA engineers
Artillery - Test Definition
config:
  target: "https://shopping.service.staging"
  phases:
    - duration: 60
      arrivalRate: 5
      name: Warm up
    - duration: 120
      arrivalRate: 5
      rampTo: 50
      name: Ramp up load
    - duration: 600
      arrivalRate: 50
      name: Sustained load
  payload:
    # Load search keywords from an external CSV file and make them available
    # to virtual user scenarios as variable "keywords":
    path: "keywords.csv"
    fields:
      - "keywords"
scenarios:
  # We define one scenario:
  - name: "Search and buy"
    flow:
      - post:
          url: "/search"
          body: "kw={{ keywords }}"
          # The endpoint responds with JSON, which we parse and extract a field from
          # to use in the next request:
          capture:
            json: "$.results[0].id"
            as: "id"
      # Get the details of the product:
      - get:
          url: "/product/{{ id }}/details"
      # Pause for 3 seconds:
      - think: 3
      # Add product to cart:
      - post:
          url: "/cart"
          json:
            productId: "{{ id }}"
artillery run search-and-add-to-cart.yml
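Before running a definition like the one above, it helps to estimate how much traffic its phases will generate, for example to sanity-check cost. The sketch below is an approximation under assumptions: it treats `rampTo` as a linear ramp (the tool's actual ramp may be stepped), and the `Phase` type and function names are hypothetical helpers, not part of Artillery.

```typescript
// Hypothetical mirror of the phase fields used in the test definition.
interface Phase {
  duration: number; // seconds
  arrivalRate: number; // new virtual users per second at the start of the phase
  rampTo?: number; // arrival rate at the end of the phase, if ramping
}

// Approximate number of virtual users created across all phases,
// using the average arrival rate of each phase.
function expectedArrivals(phases: Phase[]): number {
  return phases.reduce((total, phase) => {
    const endRate = phase.rampTo ?? phase.arrivalRate;
    const averageRate = (phase.arrivalRate + endRate) / 2;
    return total + averageRate * phase.duration;
  }, 0);
}

// The three phases from the test definition.
const phases: Phase[] = [
  { duration: 60, arrivalRate: 5 },
  { duration: 120, arrivalRate: 5, rampTo: 50 },
  { duration: 600, arrivalRate: 50 },
];

// Each virtual user makes three requests: search, product details, add to cart.
const REQUESTS_PER_USER = 3;
const totalRequests = expectedArrivals(phases) * REQUESTS_PER_USER;
```

Under these assumptions the run creates about 33,600 virtual users and about 100,800 requests over 13 minutes, which can be fed into a cost estimate before the test starts.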
But where would we run this from?
A server... 🤮
What if there was another way?
A service that can run code (without us having to manage servers) with support for massive parallel scale?
Serverless-Artillery (slsart)
Combine serverless with artillery and you get serverless-artillery for instant, cheap, and easy performance testing at scale.
Running From Different AWS Account
- We are running our load test using AWS services (i.e. Lambda).
- We don't want the load-testing infra to impact limits on our infra under test
- More realistic traffic paths
Committing Experiments
- All tests should be repeatable experiments.
- The context for the test, scenario templates and results should all be committed to the repo.
- Allows future analysis and repeating of experiments.
Components of a good load test
- Exact replica of production infrastructure
- Observability tooling
- Repeatable scenarios
- Ability to simulate high load
- Realistic user flows
- Geographic distribution
- Committed repeatable tests
Conclusion
🌎
📊
🛠
https://slides.com/alexwhite
serverless-transformation
Serverless Optimisation Workshop
Alex White Joint Presentation: Using observability to scale AWS Lambda
By Ben Ellerby
Serverless architectures on AWS, involving services like AWS Lambda, DynamoDB, Cognito, Step Functions, API Gateway, bring instant scalability when built and configured in the correct way. We’ll look at how AWS Serverless architectures need to be treated differently to ensure optimal scalability and how Serverless tools (like Serverless Artillery) can be used to verify scalability. Not only will we look at achieving scalability, we’ll also look at the tools and techniques to predict and limit the cost of scaling. To bring these topics to life we’ll look at the architecture of 2 live Serverless applications built on AWS for scale and discuss how they were architected, how costs were monitored and kept in line and how serverless load testing was used to verify scalability and catch edge cases.