Implementing IIIF By The Numbers

Kevin S. Clarke, Digital Library Software Developer
<ksclarke@library.ucla.edu>

Anthony Vuong, Development Support Engineer
<avuong@library.ucla.edu>

A Tale of Two Architectures

Sinai Palimpsests Project

  • Uses a "Level 0" IIIF-compatible tile server

 

Californica/Ursus (Hyrax/Blacklight)

  • Cantaloupe "Level 2" IIIF-compatible image server

Servers

Server-oriented architecture treats individual applications (or websites) as the primary consideration.

 

There may be multiple IIIF servers, each one selected and configured to meet the needs of its application.

Services

Service-oriented architecture provides functionality to a variety of applications.

 

"IIIF as a service" means instead of maintaining multiple architectures for serving IIIF images, we build one that meets the needs of multiple applications.

How Do We Make Decisions?

A Metrics Based Approach

  • Share and reuse work with and from our colleagues
    • Talked with The Getty (also doing IIIF measurements)
  • Build / use tools that can help test multiple factors
    • docker-cantaloupe works locally or in the cloud
    • Used `time`, Locust, CloudWatch, and AWS CLI tools

Image Conversion

Image Delivery

Vertical vs. Horizontal Scaling

Local VM

8 Cores E5-2630 v3 @ 2.40 GHz

8 GB DDR4 memory @ 2133 MHz

Compiled version of Kakadu

AWS Lambda

2 Core Lambda function

1024 MB memory

Compiled version of Kakadu

Image Conversion

Our Local Process

  • Run script for TIFF to JP2 conversion
    • Read TIFFs off our NetApp file system
    • Have Kakadu convert them into JP2s
  • Upload the JP2s to Cantaloupe's S3 source bucket
  • Time how long it takes to process 1000 images (see the sketch below)
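A minimal sketch of that local batch run, assuming Python with boto3; the NetApp mount path, output directory, bucket name, and Kakadu options are illustrative placeholders rather than our production settings:

```python
import pathlib
import subprocess
import time

import boto3  # AWS SDK for Python, used to upload the JP2s to S3

TIFF_DIR = pathlib.Path("/mnt/netapp/masters")  # placeholder NetApp mount
JP2_DIR = pathlib.Path("/var/tmp/jp2s")         # placeholder scratch space
SOURCE_BUCKET = "cantaloupe-source-bucket"      # placeholder bucket name

JP2_DIR.mkdir(parents=True, exist_ok=True)
s3 = boto3.client("s3")
start = time.time()

# Convert and upload a 1000-image batch, timing the whole run
for tiff in sorted(TIFF_DIR.glob("*.tif"))[:1000]:
    jp2 = JP2_DIR / (tiff.stem + ".jp2")
    # Kakadu's kdu_compress does the TIFF -> JP2 conversion
    subprocess.run(["kdu_compress", "-i", str(tiff), "-o", str(jp2)], check=True)
    # Put the derivative where Cantaloupe's S3 source expects it
    s3.upload_file(str(jp2), SOURCE_BUCKET, jp2.name)

print(f"Batch took {time.time() - start:.1f} seconds")
```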

Our Lambda Process

  • Upload TIFFs into an S3 bucket from NetApp file system
  • Lambda function is triggered by the bucket event
  • Kakadu in Lambda function converts TIFF into JP2
  • Lambda function stores JP2 in Cantaloupe's S3 bucket
  • Get the time it took to process 1000 images from CloudWatch (handler sketch below)
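A rough sketch of what such a Lambda handler might look like on the Python runtime, with the Kakadu binary bundled alongside the function; the destination bucket and binary path are placeholders:

```python
import os
import subprocess
import urllib.parse

import boto3

s3 = boto3.client("s3")
DEST_BUCKET = os.environ.get("DEST_BUCKET", "cantaloupe-source-bucket")  # placeholder

def handler(event, context):
    """Triggered by an S3 ObjectCreated event; converts the uploaded TIFF to JP2."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        tiff_path = "/tmp/" + os.path.basename(key)            # /tmp is capped at 512 MB
        jp2_path = os.path.splitext(tiff_path)[0] + ".jp2"

        s3.download_file(bucket, key, tiff_path)
        # Kakadu binary shipped with the deployment package (path is an assumption)
        subprocess.run(["/opt/kakadu/kdu_compress", "-i", tiff_path, "-o", jp2_path],
                       check=True)
        s3.upload_file(jp2_path, DEST_BUCKET, os.path.basename(jp2_path))
```

CloudWatch Logs records the duration of each invocation in its REPORT lines, so summing those over a 1000-image batch gives the total conversion time.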

Some Questions

  • What happens when we give our local VM 16 cores? 32?
    • How far can we vertically scale using local resources?
  • How large of an image can we convert on AWS Lambda?
    • Its file system is (currently) limited to 512 MB
    • Its memory is (currently) limited to 3008 MB

Local VMware

AWS Fargate (simple)

Image Delivery

AWS Fargate (scaled)

VMware

  • 2 VMs
  • 1 Docker container running in each VM
  • Specs of container
    • 8GB Memory
    • 6 Cores
  • Total = 16GB Memory / 12 CPU cores
  • Cantaloupe 4.1.1

AWS Fargate

  • 3 Fargate Containers
  • Specs of container
    • 8GB Memory
    • 4 Cores (Fargate max)
  • Total = 24GB Memory / 12 CPU cores
  • Cantaloupe 4.1.1
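A Fargate task with the specs above could be registered along these lines with boto3; the task family, container image, and port are assumptions for illustration, and IAM roles and networking are left out:

```python
import boto3

ecs = boto3.client("ecs")

# Register a Fargate task definition matching the specs above.
ecs.register_task_definition(
    family="cantaloupe",                            # assumed task family
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="4096",                                     # 4 vCPUs (the Fargate max at the time)
    memory="8192",                                  # 8 GB
    containerDefinitions=[
        {
            "name": "cantaloupe",
            "image": "example/cantaloupe:4.1.1",    # placeholder image
            "essential": True,
            "portMappings": [{"containerPort": 8182}],  # Cantaloupe's default port
        }
    ],
)
```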

AWS Fargate (Scaled)

  • 10 Fargate Containers
  • 8 GB / 4 CPUs per container
  • Aggregate specs
    • 80GB Memory
    • 40 CPUs
  • Cantaloupe 4.1.1
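Scaling out from there is mostly a matter of raising the ECS service's desired task count; a minimal boto3 sketch, with made-up cluster and service names:

```python
import boto3

ecs = boto3.client("ecs")

# Bump the (hypothetical) Cantaloupe service from 3 to 10 Fargate tasks
ecs.update_service(
    cluster="iiif-cluster",        # assumed cluster name
    service="cantaloupe-service",  # assumed service name
    desiredCount=10,
)
```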

Delivery Test Case #1

  • Single Image Fixed Test
    • Large Image (110-130 MB)
    • Medium Image (50-60 MB)
    • PCT:50 / Full Image Request
  • IIIF URIs used (timing sketch below)
    • /full/pct:50/0/default.jpg?cache=false
    • /full/full/0/default.jpg?cache=false
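For the fixed single-image test, the timings can be captured with something as simple as the sketch below; the hostname and image identifier are placeholders, and the `cache=false` query string shown above is presumably there to bypass caching:

```python
import time

import requests

BASE = "https://iiif.example.edu/iiif/2/test-image"  # placeholder host and identifier
PATHS = [
    "/full/pct:50/0/default.jpg?cache=false",
    "/full/full/0/default.jpg?cache=false",
]

for path in PATHS:
    start = time.time()
    response = requests.get(BASE + path)
    response.raise_for_status()
    elapsed = time.time() - start
    print(f"{path}: {elapsed:.2f}s, {len(response.content)} bytes")
```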

Large Full Image Results

Medium Full Image Results

Large Image Results (50%)

Medium Image Results (50%)

Large Full Multi-Region

Large 50% Multi-Region

Medium Full Multi-Region

Medium 50% Multi-Region

Delivery Test Case #2

  • Simulated workload with concurrent users
  • Various regions/tiles of an image
    • Large Image (110-130 MB)
    • Medium Image (50-60 MB)
    • 20, 50, 100, 200 concurrent users
      • 5-15 second wait per user request
  • 1000+ URLs, picked at random
  • Locust load test run for 5 minutes (locustfile sketch below)
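A stripped-down locustfile for that workload might look like this, written against the current Locust API; the URL list file is a placeholder:

```python
import random

from locust import HttpUser, between, task

# Placeholder: a file containing 1000+ pre-generated IIIF region/tile URL paths
with open("iiif_urls.txt") as urls_file:
    URLS = [line.strip() for line in urls_file if line.strip()]

class IIIFUser(HttpUser):
    # Each simulated user waits 5-15 seconds between requests
    wait_time = between(5, 15)

    @task
    def random_region(self):
        # Request one of the pre-generated region/tile URLs at random
        self.client.get(random.choice(URLS))
```

With recent Locust versions this can be run headless for a five-minute window at each concurrency level, e.g. `locust -f locustfile.py --host https://iiif.example.edu --headless -u 200 -r 20 -t 5m` (the host is a placeholder).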

Random - Large TIFF Results

Random Large TIFF Unskewed Results

Random - Large Lossless Results

Random Large Lossless Unskewed Results

Random - Large Lossy Results

Random Large Lossy Unskewed Results

Random - Medium TIFF Results

Random Medium TIFF Unskewed Results

Random - Medium Lossless Results

Random Medium Lossless Unskewed Results

Random - Medium Lossy Results

Discoveries

  • Generating JPEG derivatives for full-image requests from TIFF sources is faster than using Kakadu's Native Processor with lossless or lossy JP2s
  • S3 GET speeds appear to top out at roughly gigabit: averages of 30-50 MB/s vs 400 MB/s on local NetApp storage
    • Makes a huge difference for larger images, not so much for smaller ones
    • Slow transfers can cause image downloads to stack up and bog down container resources
  • Our current assumption is that network bandwidth, not compute, is the bottleneck
  • Scaling out the Fargate containers allows more requests to run in parallel and balances the load across multiple resources

What do we do now?

  • Decide on Lossy, Lossless, or TIFF as sources
  • Decide on using on-premise hardware or AWS
  • For full-image pixel requests, latency becomes an issue; delivery times can vary between on-premise and AWS
    • Do we want to serve full image pixels?
  • Launch and experiment with a production service
    • Gather real work-load use cases
  • Experiment with reducing latency from other countries using AWS CloudFront
  • Make containers more "production" ready
    • More detailed monitoring needed!
  • Automate all the things!

UCLA Library

Kevin S. Clarke, Digital Library Software Developer
<ksclarke@library.ucla.edu>

Anthony Vuong, Development Support Engineer
<avuong@library.ucla.edu>
