Introduction to Amazon Athena

Jowanza Joseph

@jowanza

www.jowanza.com

Agenda

  • About me ~ 1 minute
  • The mechanics behind Athena ~ 5 minutes
  • Cost structure of Athena ~ 1 minute
  • File format tradeoffs ~ 10 minutes
  • Demo ~ 20 minutes
    • File setup ~ 5 minutes
    • Query API ~ 5 minutes
    • Notebooks ~ 10 minutes

About Me

  • Software Engineer at One Click Retail
  • Spark / Flink: Scala / Java
  • Writing a book: The Apache Spark Field Book
  • gRPC 
  • Cycling / Golf
  • Proud Dad

An Example Big Data Stack

Compute & Storage

How Athena Works

Features

  • Serverless
  • Files stored on S3
  • Pay-per-query
  • SQL API
  • JDBC Connector
  • File format control
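
As a rough illustration of the SQL API bullet, the sketch below submits a query through the boto3 Athena client and polls until it reaches a terminal state. It assumes boto3 and AWS credentials are available. The database, table, and S3 output location in the usage comment are hypothetical placeholders, not names from this talk.

```python
import time


def build_query_request(sql, database, output_location):
    """Build the keyword arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(sql, database, output_location, poll_seconds=1.0):
    """Submit a query, wait for a terminal state, return (state, rows)."""
    import boto3  # imported lazily so the request builder works without it

    athena = boto3.client("athena")
    request = build_query_request(sql, database, output_location)
    query_id = athena.start_query_execution(**request)["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll for the final state.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        return state, []
    results = athena.get_query_results(QueryExecutionId=query_id)
    return state, results["ResultSet"]["Rows"]


# Usage (hypothetical names; replace with a real database, table, and bucket):
# state, rows = run_query(
#     "SELECT * FROM example_table LIMIT 10",
#     database="example_db",
#     output_location="s3://example-bucket/athena-results/",
# )
```

Query results are also written as files under the output location, so they can be read back from S3 later without re-running the query.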

Cost Structure

$5 per Terabyte Scanned
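
To make the pay-per-query pricing concrete, here is a small cost estimator. Athena's published on-demand rate is $5 per terabyte scanned, rounded up to the nearest megabyte with a 10 MB minimum per query. Treating a terabyte as 1024^4 bytes is an assumption of this sketch, and the function name is illustrative only.

```python
import math

MB = 1024 ** 2          # bytes per megabyte (binary units assumed)
TB = 1024 ** 4          # bytes per terabyte (binary units assumed)
PRICE_PER_TB_USD = 5.0  # on-demand rate per terabyte scanned
MIN_BILLABLE_MB = 10    # per-query minimum


def athena_query_cost(bytes_scanned):
    """Estimate the cost in USD of one query that scans `bytes_scanned` bytes."""
    # Round up to whole megabytes, then apply the per-query minimum.
    billable_mb = max(math.ceil(bytes_scanned / MB), MIN_BILLABLE_MB)
    return billable_mb * MB / TB * PRICE_PER_TB_USD


print(athena_query_cost(3 * TB))    # a full scan of 3 TB of data
print(athena_query_cost(200 * MB))  # a query that touches only 200 MB
```

Because the bill depends only on bytes scanned, anything that reduces the data read per query (compression, columnar formats, partitioning) lowers the cost directly.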

File Formats

File Formats Matter
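
One reason file formats matter: with a row-oriented format such as CSV, a query that needs a single column still scans every byte of every row, while a columnar format such as Parquet or ORC lets the engine read only the columns it needs. The sketch below imitates that difference with plain Python strings. The sample data and the newline-joined "column layout" are made up for illustration and are not how Parquet actually encodes data.

```python
import csv
import io


def row_layout_bytes(rows):
    """Size of the data written row by row as CSV."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return len(buf.getvalue().encode("utf-8"))


def column_layout_bytes(rows):
    """Size of each column when its values are stored contiguously."""
    columns = zip(*rows)
    return [len("\n".join(map(str, col)).encode("utf-8")) for col in columns]


# Hypothetical sample table: (id, user, price, country)
rows = [(i, f"user_{i}", i * 1.5, "US") for i in range(1000)]

full_scan = row_layout_bytes(rows)          # CSV: SELECT price reads it all
price_only = column_layout_bytes(rows)[2]   # columnar: read just one column

print(f"row layout scan:    {full_scan} bytes")
print(f"column layout scan: {price_only} bytes")
```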

Compression
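
Athena can read gzip-compressed files directly, and it bills on the compressed bytes scanned, so compressing data before uploading it to S3 reduces both storage and query cost. The sketch below compares raw and gzip-compressed sizes for some made-up CSV text. The actual ratio depends heavily on the data, so no specific savings are implied.

```python
import gzip


def compressed_size(data):
    """Return the size in bytes of `data` after gzip compression."""
    return len(gzip.compress(data))


# Hypothetical CSV content: repetitive text like this compresses well.
raw = "\n".join(
    f"{i},user_{i},US,2017-01-01" for i in range(10_000)
).encode("utf-8")

packed = compressed_size(raw)
print(f"raw:  {len(raw)} bytes")
print(f"gzip: {packed} bytes")
```

One tradeoff to keep in mind: a gzip file cannot be split, so a single very large gzip file is read by one worker rather than in parallel.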

Basic Cheat Sheet

Demo
