Serverless Basecalling with AWS Lambda

Piotr Grzesik

dr hab. inż. Dariusz Mrozek, prof. PŚ

Silesian University of Technology

Agenda

  1. Nanopore Sequencing
  2. MinION Nanopore
  3. Serverless Computing
  4. Analysis workflow
  5. Testing environment
  6. Experiments
  7. Summary
  8. Next steps

Nanopore sequencing

Nanopore sequencing - developed by Oxford Nanopore Technologies, it is a DNA sequencing process that works by monitoring changes in an electrical current caused by a DNA strand passing through a nanopore. The resulting signal is decoded into specific DNA or RNA sequences. This decoding process is called basecalling.

MinION Nanopore

MinION Nanopore - a portable sequencing device, released by Oxford Nanopore Technologies in 2014. It is the first device that enables portable sequencing at an affordable price ($1,000). It is powered via USB and weighs under 100 g, which makes it usable as a field device.

Serverless computing

Serverless computing is a computing paradigm built on simple, stateless functions (also called Functions-as-a-Service). Such functions offer low maintenance overhead and fault tolerance, support massive parallelism, allocate resources on demand, and can quickly scale both up and down. An additional benefit of this paradigm is that users pay only for actual function invocations and not for idle time.

Serverless computing

Serverless computing is also gaining popularity in the bioinformatics literature:

  • sVEP (Serverless Variant Effect Predictor) - Cloud Native Variant Annotation Pipeline - speedup vs traditional VM-based solutions
  • GT-Scan2 - Serverless Target Finder for genome editing - more cost-effective than traditional VM-based solutions
  • Serverless all-against-all Pairwise comparison using Smith-Waterman algorithm - both faster and more cost-effective than traditional VM-based solutions

AWS Lambda (until 2020)

  • Serverless computing platform, introduced in 2014 by Amazon Web Services
  • Out-of-the-box support for Java, .NET Core, Go, Ruby, Python and Node.js. Supports virtually any runtime via custom runtimes, introduced in 2018
  • Easy integration with S3, Kinesis, CloudWatch, API Gateway
  • Execution time limit - 15 minutes
  • Up to 3,008 MB of memory per invocation
  • Alternatives: Google Cloud Functions, Microsoft Azure Functions, Apache OpenWhisk

Why Lambda for bioinformatics?

(Added at the end of 2020)

  • Support for up to 10,240 MB of memory
  • Support for up to 6 virtual CPU cores
  • Support for Docker containers
  • Support for packages of up to 10 GB in size
  • Support for AVX2 instruction set

Analysis workflow

In the proposed workflow, the first step is uploading FAST5 files from the MinION Nanopore to an S3 bucket. In the next step, processing is triggered manually: the first Lambda function splits the FAST5 files into batches and schedules the execution of multiple Lambda functions, which run the basecalling operation and save the results to an S3 bucket as well.
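The splitting and scheduling step can be sketched roughly as follows. This is a minimal illustration, not the implementation used in this work: the function names, the payload layout and the injected `invoke` callable are assumptions made for the example.

```python
from typing import Callable, Iterable, List


def split_into_batches(keys: Iterable[str], batch_size: int) -> List[List[str]]:
    """Group FAST5 object keys into batches of at most batch_size files."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    keys = list(keys)
    return [keys[i:i + batch_size] for i in range(0, len(keys), batch_size)]


def dispatch(keys: Iterable[str], batch_size: int,
             invoke: Callable[[dict], None]) -> int:
    """Schedule one basecalling invocation per batch and return the batch count.

    `invoke` is a hypothetical callable that starts one basecalling function
    with the given payload; it is injected so the sketch runs without AWS.
    """
    batches = split_into_batches(keys, batch_size)
    for index, batch in enumerate(batches):
        # The payload layout is an assumption made for this sketch.
        invoke({"batch_index": index, "fast5_keys": batch})
    return len(batches)
```

In a real deployment, `invoke` could wrap the boto3 Lambda client's `invoke` call with `InvocationType="Event"`, so that the basecalling functions run asynchronously and in parallel.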

Software used

  • Guppy - A state-of-the-art, closed-source basecaller developed by Oxford Nanopore Technologies, the creator of the MinION Nanopore device. It supports both CPU and GPU acceleration on selected devices and runs on both AArch64 and x86-64 architectures. It also offers two modes: fast and hac (high accuracy).
  • Bonito - An open-source basecaller under development at Oxford Nanopore Technologies. It uses recurrent neural networks to perform basecalling, but it is currently experimental and does not offer great performance.

Software used

  • Deepnano-blitz - An open-source basecaller developed by Vlado Boza et al., written in Rust and Python. It is based on RNNs and uses beam search as an optimization step. It is heavily optimized for Intel processors that support the AVX2 instruction set. Unfortunately, it is not compatible with the AWS Lambda runtime as-is and has to be adjusted, because multiprocessing has to be implemented in a specific way for the AWS Lambda runtime.

Experiments

  • FAST5 files with data from sequencing runs containing material of Escherichia coli and Klebsiella pneumoniae

  • Measured for each basecaller and model: samples processed per second, and samples processed per second per MB of memory

  • Both Guppy and Bonito were tested

  • Experiments were run with 256, 512, 1024, 2048, 4096, 6144, 8192 and 10240 (maximum) MB of RAM available to a single Lambda function
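The two metrics reported in the results can be computed as in the sketch below. The function name and argument layout are assumptions made for illustration; they do not come from the benchmarking code used in the experiments.

```python
from typing import Tuple


def throughput(samples: int, seconds: float, memory_mb: int) -> Tuple[float, float]:
    """Return (samples per second, samples per second per MB of memory)."""
    if seconds <= 0 or memory_mb <= 0:
        raise ValueError("seconds and memory_mb must be positive")
    per_second = samples / seconds
    # Normalising by configured memory shows how efficiently each
    # Lambda memory setting is used, not only how fast it is.
    return per_second, per_second / memory_mb
```

The per-MB figure matters because Lambda pricing scales with the configured memory, so the fastest configuration is not necessarily the most cost-effective one.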

Results

Samples per second processed by Guppy with the fast model

Results

Samples per second per MB of memory for Guppy fast model

Results

Samples per second processed by Guppy with the high accuracy (hac) model

Results

Samples per second per MB of memory for Guppy high accuracy model

Conclusions & next steps

  • The Guppy CPU basecaller offers the best performance
  • Other basecallers are either very slow (Bonito) or require significant adjustments (Deepnano-blitz)
  • The solution takes advantage of the recently introduced container support in Lambda
  • Scaled to 100 simultaneous functions, each with 6 vCPUs, in < 1 minute
  • On average, each function processes at most ~600,000 samples/s, which, given the theoretical maximum output of the MinION (2,300,000 signals/s), allows 3-4 functions to basecall data in near real time
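The estimate in the last bullet can be checked with a short calculation. The sketch below uses only the two figures quoted above; the function and constant names are chosen for illustration.

```python
import math

MINION_MAX_SIGNALS_PER_S = 2_300_000  # theoretical maximum output of MinION
FUNCTION_SAMPLES_PER_S = 600_000      # approximate peak average per function


def functions_needed(device_rate: float, function_rate: float) -> int:
    """Smallest number of functions whose combined rate covers device_rate."""
    if function_rate <= 0:
        raise ValueError("function_rate must be positive")
    return math.ceil(device_rate / function_rate)


print(functions_needed(MINION_MAX_SIGNALS_PER_S, FUNCTION_SAMPLES_PER_S))  # 4
```

Four functions cover the theoretical maximum (2,300,000 / 600,000 ≈ 3.83); three are enough whenever the device produces no more than 1,800,000 signals/s.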

 

Conclusions & next steps

  • The ability to quickly scale to >1000 functions makes it possible to massively parallelize processing of data from multiple MinION Nanopore devices
  • Further scaling & optimization
  • Evaluation of Deepnano-blitz (AVX2 optimizations)
  • Adding classification step with Kraken2 (some experiments already performed)
  • Evaluation of Serverless Computing in other branches of Bioinformatics

 

Hybrid approach

In the proposed workflow, the first step is to determine whether cloud offloading can speed up the edge processing. Then, the files are split into batches for edge and serverless processing, based on the theoretical processing speeds of both approaches and on the upload speed. Finally, the two sets of files are processed separately, and the edge device monitors and collects the results from the cloud-based processing as well.
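One way to make such a split is sketched below. It is an illustrative model only, not the algorithm used in this work: it assumes that the cloud path is bounded by the slower of upload speed and cloud processing speed, that all rates are expressed in the same unit (for example bytes of input per second), and that both sides should finish at about the same time. All names are hypothetical.

```python
from typing import List, Tuple


def cloud_fraction(edge_rate: float, cloud_rate: float, upload_rate: float) -> float:
    """Fraction of the data to offload so edge and cloud finish together."""
    # Assumption: the cloud path cannot be faster than the upload link.
    effective_cloud = min(cloud_rate, upload_rate)
    if edge_rate < 0 or effective_cloud < 0:
        raise ValueError("rates must be non-negative")
    total = edge_rate + effective_cloud
    if total <= 0:
        raise ValueError("at least one rate must be positive")
    return effective_cloud / total


def split_files(sizes: List[int], fraction: float) -> Tuple[List[int], List[int]]:
    """Return (edge_indices, cloud_indices) for files with the given sizes.

    Greedy, in input order: a file goes to the cloud only if it still fits
    under the target share of the total size, otherwise it stays on the edge.
    """
    target = fraction * sum(sizes)
    edge, cloud, assigned = [], [], 0
    for index, size in enumerate(sizes):
        if assigned + size <= target:
            cloud.append(index)
            assigned += size
        else:
            edge.append(index)
    return edge, cloud
```

Under this model, offloading helps whenever the effective cloud rate is above zero, and the offloaded share shrinks as the upload link gets slower.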

Experiments

  • FAST5 files with data from sequencing runs containing material of Escherichia coli and Klebsiella pneumoniae

  • Jetson Xavier NX as the edge device, tested in its lowest (10 W, 2 cores) and highest (15 W, 6 cores) power modes

  • Guppy basecaller was tested

  • Experiments were run for upload speeds of 128, 256 and 512 kB/s
