AWS Lambda Practice

Andrii Chykharivskyi

Solution Experience

Initially, we used a function hosted on AWS both for local development and for the CI pipeline.


Over time, we ran into major problems with this approach:

  • dependency on the Internet connection
  • limited invocation count on the free tier


What do we need?

  • it should work locally
  • it should work with AWS S3 (and with Minio locally)

Creating


import os
from src import s3_client
from PyPDF3 import PdfFileReader

def handler(event, context):
    input_s3_bucket_name = event['input_s3_bucket_name']
    input_file_token = event['input_file_token']
    input_file_path = event['input_file_path']

    aws_bucket = s3_client.get_client().Bucket(input_s3_bucket_name)

    # /tmp is the only writable location inside a Lambda container
    local_directory = '/tmp' + os.sep + input_file_token
    os.makedirs(local_directory, exist_ok=True)

    local_path = local_directory + os.sep + 'input.pdf'
    aws_bucket.download_file(input_file_path, local_path)

    with open(local_path, "rb") as input_file:
        pages_count = PdfFileReader(input_file).getNumPages()

    return {
        "statusCode": 200,
        "pagesCount": pages_count
    }

Get PDF file pages count
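Before wiring the function into a container, the handler can be smoke-tested directly from Python. A minimal sketch, assuming the module lives at src/app.py, the S3 client is configured (AWS_S3_ENDPOINT_URL for local Minio, or real AWS credentials), and the bucket already holds a test PDF; all names are placeholders:

from src.app import handler

# Sample event mirroring the payload the function expects
event = {
    "input_s3_bucket_name": "bucketname",   # placeholder bucket
    "input_file_token": "randomstring",     # any unique token
    "input_file_path": "tmp/original.pdf",  # placeholder object key
}

# The Lambda context object is not used by this handler
response = handler(event, context=None)
print(response["pagesCount"])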


FROM public.ecr.aws/lambda/python:3.8

# Copy function code
COPY src/app.py ${LAMBDA_TASK_ROOT}

# Install the function's dependencies using file requirements.txt
# from your project folder.
COPY requirements.txt  .
RUN  pip3 install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"
ADD . .

ARG AWS_ACCESS_KEY_ID
ARG AWS_SECRET_ACCESS_KEY
ARG AWS_DEFAULT_REGION
ARG AWS_S3_ENDPOINT_URL

ENV AWS_ACCESS_KEY_ID ${AWS_ACCESS_KEY_ID}
ENV AWS_SECRET_ACCESS_KEY ${AWS_SECRET_ACCESS_KEY}
ENV AWS_DEFAULT_REGION ${AWS_DEFAULT_REGION}
ENV AWS_S3_ENDPOINT_URL ${AWS_S3_ENDPOINT_URL}

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "app.handler" ]

File Dockerfile


version: '3.7'

services:
  pdf-pages:
    build:
      context: ./
      dockerfile: Dockerfile
    restart: unless-stopped
    environment:
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID:-tenantcloud}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY:-tenantcloud}
      AWS_DEFAULT_REGION: ${AWS_DEFAULT_REGION:-us-east-1}
      AWS_S3_ENDPOINT_URL: ${AWS_S3_ENDPOINT_URL}
    volumes:
      - ./src:/var/task/src
    ports:
      - "${DOCKER_PDF_PAGES_PORT:-3000}:8080"
    networks:
      home:

networks:
  home:
    name: "${COMPOSE_PROJECT_NAME:-tc-lambda}_network"

File docker-compose.yml


If Minio S3 is deployed inside this same network, it will be reachable from the container of our function.
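For example, a local test bucket can be created against that Minio instance. A sketch, assuming the minio.tc.loc network alias from the compose file and the default tenantcloud credentials (all placeholders):

import boto3

# Endpoint and credentials mirror the compose defaults (placeholders);
# verify=False accepts the self-signed local certificate
s3 = boto3.resource(
    's3',
    endpoint_url='https://minio.tc.loc',
    aws_access_key_id='tenantcloud',
    aws_secret_access_key='tenantcloud',
    config=boto3.session.Config(signature_version='s3v4'),
    verify=False
)
s3.create_bucket(Bucket='bucketname')  # hypothetical test bucket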


import os
import boto3

s3_endpoint_url = os.getenv('AWS_S3_ENDPOINT_URL')


def get_client():
    # When a custom endpoint is configured (e.g. a local Minio),
    # point boto3 at it and accept the self-signed certificate
    if s3_endpoint_url:
        return boto3.resource(
            's3',
            endpoint_url=s3_endpoint_url,
            config=boto3.session.Config(signature_version='s3v4'),
            verify=False
        )

    # Otherwise fall back to the real AWS S3
    return boto3.resource('s3')

Updated S3 client that supports Minio
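Calling code stays identical in both environments; a usage sketch with a placeholder bucket name:

from src import s3_client

# Works against Minio when AWS_S3_ENDPOINT_URL is set,
# and against real AWS S3 when it is not
bucket = s3_client.get_client().Bucket('bucketname')
for obj in bucket.objects.limit(5):
    print(obj.key)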


version: '3.7'

services:
  minio:
    image: ${REGISTRY:-registry.tenants.co}/minio:RELEASE.2020-10-28T08-16-50Z-48-ge773e06e5
    restart: unless-stopped
    entrypoint: sh
    command: -c "minio server --address :443 --certs-dir /root/.minio /export"
    environment:
      MINIO_ACCESS_KEY: ${AWS_ACCESS_KEY_ID:-tenantcloud}
      MINIO_SECRET_KEY: ${AWS_SECRET_ACCESS_KEY:-tenantcloud}
    volumes:
      - ${DOCKER_MINIO_KEY:-./docker/ssl/tc.loc.key}:/root/.minio/private.key
      - ${DOCKER_MINIO_CRT:-./docker/ssl/tc.loc.crt}:/root/.minio/public.crt
    networks:
      home:
        aliases:
          - minio.${HOST}

networks:
  home:
    name: "${COMPOSE_PROJECT_NAME:-tc-lambda}_network"

File docker-compose.minio.yml 


Invoking the Docker container function after the build

curl -XPOST "http://localhost:3000/2015-03-31/functions/function/invocations" -d '{
  "input_s3_bucket_name": "bucketname",
  "input_file_token": "randomstring",
  "input_file_path": "tmp/original.pdf"
}'

The path /2015-03-31/functions/function/invocations is exposed by the Lambda Runtime Interface Emulator built into the AWS base image; the port (3000) comes from docker-compose.yml.
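The same invocation from Python, e.g. for automated tests; a sketch assuming the container is running on the default port:

import requests

url = "http://localhost:3000/2015-03-31/functions/function/invocations"
event = {
    "input_s3_bucket_name": "bucketname",
    "input_file_token": "randomstring",
    "input_file_path": "tmp/original.pdf"
}

# The emulator synchronously returns the handler's response
response = requests.post(url, json=event)
print(response.json())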


S3 dependency

Previously, we used a separate bucket for each function.

Disadvantages of this approach:

  • every new function requires creating and configuring its own bucket
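This is why the handler shown earlier takes the bucket name from the event: one deployed function can serve any application's bucket. A sketch with placeholder names:

# Two applications, one function; only the event differs
event_app_a = {
    "input_s3_bucket_name": "app-a-bucket",
    "input_file_token": "randomstring",
    "input_file_path": "tmp/original.pdf"
}
event_app_b = {**event_app_a, "input_s3_bucket_name": "app-b-bucket"}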


CI/CD

Build. File .gitlab-ci.yml

image: ${REGISTRY}/docker-pipeline:19.03.15-dind

variables:
  FF_USE_FASTZIP: 1
  FF_SCRIPT_SECTIONS: 1
  COMPOSE_DOCKER_CLI_BUILD: 1
  DOCKER_BUILDKIT: 1
  DOCKER_CLIENT_TIMEOUT: 300
  COMPOSE_HTTP_TIMEOUT: 300
  COMPOSE_FILE: "docker-compose.yml:docker-compose.minio.yml"
  REPOSITORY_NAME: $CI_PROJECT_NAME
  SOURCE_BRANCH_NAME: $CI_COMMIT_BRANCH
  SLACK_TOKEN: $DASHBOARD_SLACK_TOKEN
  COMPOSE_PROJECT_NAME: ${CI_JOB_ID}
  GIT_CLONE_PATH: $CI_BUILDS_DIR/$CI_PROJECT_NAME/$CI_JOB_ID
  SLACK_CHANNEL: "#updates"

stages:
  - build
  - deploy


Build. File .gitlab-ci.yml

Build container:
  <<: *stage_build
  tags: ["dind_pipeline1"]
  extends:
    - .dind_service
  script:
    - sh/build.sh
  rules:
    - if: $CI_PIPELINE_SOURCE == 'push' && ($CI_COMMIT_BRANCH == 'master')
      when: on_success


Build. File sh/build.sh

function getLatestTag() {
    # Get your latest version and increment current version
}

function ecrLogin() {
    # Login to AWS ECR
}

# Login to Production account
ecrLogin "$AWS_ACCESS_KEY_ID" \
  "$AWS_SECRET_ACCESS_KEY" \
  "$AWS_DEFAULT_REGION" \
  "$AWS_ACCOUNT_ID"

function pushContainer() {
  AWS_ACCOUNT_ID=$1
  # Push to the same region the image was built for
  docker push "${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com/${REPOSITORY_NAME}:${NEW_VERSION}"
}

echo "Building containers..."
docker build \
  --tag "$AWS_ACCOUNT_ID".dkr.ecr."$AWS_DEFAULT_REGION".amazonaws.com/"$REPOSITORY_NAME":"${NEW_VERSION}" .

echo "Pushing container to ECR..."
pushContainer "$AWS_ACCOUNT_ID"


Deploy. File .gitlab-ci.yml

Deploy to production:
  <<: *stage_deploy
  tags: ["dind_pipeline1"]
  needs:
    - job: Build container
  extends:
    - .dind_service
  script:
    - sh/deploy.sh
  after_script:
    - |
if [ "$CI_JOB_STATUS" == "success" ]; then
        echo "$SLACK_TOKEN" > /usr/local/bin/.slack
        slack chat send --color good ":bell: $REPOSITORY_NAME function updated" "$SLACK_CHANNEL"
      fi
  rules:
    - if: $CI_PIPELINE_SOURCE == 'push' && ($CI_COMMIT_BRANCH == 'master')
      when: manual


Deploy. File sh/deploy.sh

#!/bin/bash

# shellcheck disable=SC2034,SC2001

set -e

export AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY
export AWS_DEFAULT_REGION=$AWS_DEFAULT_REGION
export AWS_ACCOUNT_ID=$AWS_ACCOUNT_ID

# Update lambda function
echo "Update lambda function..."
aws lambda update-function-code \
  --function-name "$REPOSITORY_NAME" \
  --image-uri "$AWS_ACCOUNT_ID".dkr.ecr."$AWS_DEFAULT_REGION".amazonaws.com/"$REPOSITORY_NAME"
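The same deployment step is available via boto3, should the pipeline ever move away from the CLI; a sketch with placeholder account, region, and repository values:

import boto3

client = boto3.client('lambda')

# boto3 equivalent of `aws lambda update-function-code --image-uri ...`
client.update_function_code(
    FunctionName='pdf-pages',  # placeholder function name
    ImageUri='123456789012.dkr.ecr.us-east-1.amazonaws.com/pdf-pages:latest'
)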


Versioning

Lambda allows you to publish one or more immutable versions of individual Lambda functions. Each function version has a unique ARN.
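A quick way to inspect published versions and their ARNs from Python; the function name is a placeholder:

import boto3

client = boto3.client('lambda')

# Each published version carries its own qualified ARN
for version in client.list_versions_by_function(FunctionName='pdf-pages')['Versions']:
    print(version['Version'], version['FunctionArn'])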


Advantages:

  • Lambda allows you to provide a description for a published version
  • function consumers can continue using a previous version without any disruption

ARN (Amazon Resource Name) is the unique identifier for resources in the AWS Cloud.

arn:aws:lambda:us-east-1:465951521245:function:print-prime-numbers:1

(region: us-east-1, resource name: print-prime-numbers, version: 1)

When the version is included, it is called a qualified ARN; when the version is omitted, it is an unqualified ARN.
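In practice the difference shows up when invoking; a sketch with a placeholder function name:

import boto3

client = boto3.client('lambda')

# Unqualified: resolves to $LATEST
client.invoke(FunctionName='print-prime-numbers')

# Qualified: pins the call to version 1, equivalent to the qualified ARN
client.invoke(FunctionName='print-prime-numbers', Qualifier='1')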


Disadvantages:

  • version numbers are only sequentially incremented (bad for hotfixes)
  • can't split traffic routing between different versions

Reminder.

Don't forget that a published function version is fully immutable (including its configuration).

Aliases.

  • each alias points to a certain function version
  • the alias name can be almost any string (but not purely numeric, so it is never confused with a version number)
  • an alias cannot point to another alias, only to a function version
  • useful for routing traffic to new versions after proper testing, as shown below
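A sketch of repointing a stable alias after testing; the function and alias names are placeholders:

import boto3

client = boto3.client('lambda')

# Consumers keep invoking the "production" alias while the
# version underneath changes
client.update_alias(
    FunctionName='pdf-pages',
    Name='production',
    FunctionVersion='2'
)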


Implementation.

# Update lambda function
echo "Update lambda function..."
aws lambda update-function-code \
  --function-name "$REPOSITORY_NAME" \
  --image-uri "$AWS_ACCOUNT_ID".dkr.ecr."$AWS_DEFAULT_REGION".amazonaws.com/"$REPOSITORY_NAME":"${CURRENT_VERSION}"

Implementation. Blocking.

# Waiting for successfully deployment
echo "Waiting for successfully update lambda function..."
aws lambda wait function-updated --function-name "$REPOSITORY_NAME"

You cannot publish a new version while the function's update status is "In progress".
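The boto3 equivalent of `aws lambda wait function-updated` is a waiter; a sketch with a placeholder function name:

import boto3

client = boto3.client('lambda')

# Blocks until LastUpdateStatus leaves "InProgress"
waiter = client.get_waiter('function_updated')
waiter.wait(FunctionName='pdf-pages')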

Implementation. Publishing.

# Publish new lambda function version
echo "Publish new lambda function version..."
aws lambda publish-version \
  --function-name "$REPOSITORY_NAME" \
  --description "${CURRENT_VERSION}"

# Create new alias; strip the leading character from the tag
# (e.g. "v12" -> "12") to get the numeric function version
echo "Create new lambda function alias..."
FUNCTION_VERSION=$(echo "$CURRENT_VERSION" | sed 's/^.//')
aws lambda create-alias \
    --function-name "$REPOSITORY_NAME" \
    --description "$CURRENT_VERSION" \
    --function-version "$FUNCTION_VERSION" \
    --name "$CURRENT_VERSION"

Traffic shifting

Advantages:

  • easy to implement a rolling or canary deployment strategy (current and new versions can coexist, each receiving traffic)
  • the development team can identify possible issues before the new version takes all traffic (see the sketch below)
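A sketch of weighted traffic shifting on an alias, with placeholder names: 90% of invocations stay on version 2 while 10% canary onto version 3:

import boto3

client = boto3.client('lambda')

# The alias serves FunctionVersion by default; AdditionalVersionWeights
# routes the given fraction of traffic to the new version
client.update_alias(
    FunctionName='pdf-pages',
    Name='production',
    FunctionVersion='2',
    RoutingConfig={'AdditionalVersionWeights': {'3': 0.1}}
)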

Credits to Maksym Bilozub and Viktor Fedkiv

Andrii Chykharivskyi