Introduction to AWS Machine Learning AI Services for Pythonistas

Piotr Grzesik

What do I do?

What will I be talking about?

What are the benefits of Machine Learning APIs?
Overview of ML API services on AWS
How to use Machine Learning APIs in Python apps?

Building your own solution

Using Machine Learning APIs

Benefits of ML APIs

Does not require specific ML knowledge (can be used by "regular" developers)
Easy to consume via API/SDK
Trained on a large dataset
Models are constantly re-trained/improved
"Infinitely" scalable
Pay-per-use

Cons of ML APIs

Limited functionality, suitable only for common tasks
Can be much pricier than a custom solution in the long run
No way to customize/tweak models (there are small exceptions)
Vendor lock-in (not really)

Providers

What does AWS have to offer?

Rekognition Image/Video
Polly
Comprehend
Transcribe
Translate
Textract
Lex

Rekognition

Image and video recognition service
Supports processing images from S3 buckets
Supports batch and real-time processing
1000 minutes of video analysis and 5000 images per month available in Free Tier
Alternatives: Google Cloud Vision API, Azure Computer Vision API

Rekognition Image features

Object and scenery detection
Facial recognition
Facial analysis
Face comparison
Inappropriate image detection
Celebrity recognition
Text detection in images

Rekognition Video features

Person identification, tracking and pathing
Face recognition
Facial analysis
Objects, scenes and activities detection
Inappropriate video detection
Celebrity recognition

Rekognition Image demo
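
A minimal sketch of object and scenery detection with boto3, assuming AWS credentials and a default region are already configured. The function name `detect_labels_in_s3` and the bucket and key values are illustrative, not taken from the talk.

```python
def detect_labels_in_s3(bucket, key, client=None, max_labels=10, min_confidence=80.0):
    """Return (label name, confidence) pairs for an image stored in S3."""
    if client is None:
        # Imported lazily so the sketch can be exercised with a stub client.
        import boto3
        client = boto3.client("rekognition")
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]
```

Calling `detect_labels_in_s3("my-bucket", "photo.jpg")` (hypothetical names) would return the labels Rekognition found at or above the confidence threshold.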

Rekognition use cases

Detect unsafe/inappropriate images and videos
Index images/videos based on scenery and/or emotions
Blur faces in images and videos
Verify users based on a face image
Verify license plates
Read handwritten text from images

Polly

Speech synthesis service
Supports near real-time and batch processing
5 million characters for speech or Speech Marks requests per month available in Free Tier
Alternatives: Google Cloud Text-To-Speech

Polly features

Supports MP3, Ogg Vorbis and PCM formats
Supports generating "Speech Marks"
Supports SSML (Speech Synthesis Markup Language)
Supports PLS (Pronunciation Lexicon Specification)

Polly demo
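
A minimal sketch of near real-time speech synthesis with boto3, assuming configured AWS credentials. The helper name `synthesize_to_file`, the output path, and the choice of the `Joanna` voice are assumptions made for illustration.

```python
def synthesize_to_file(text, path, client=None, voice_id="Joanna"):
    """Synthesize `text` to an MP3 file and return the number of bytes written."""
    if client is None:
        # Imported lazily so the sketch can be exercised with a stub client.
        import boto3
        client = boto3.client("polly")
    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # AudioStream is a file-like object holding the synthesized audio.
    audio = response["AudioStream"].read()
    with open(path, "wb") as audio_file:
        audio_file.write(audio)
    return len(audio)
```

For SSML input, `synthesize_speech` also accepts `TextType="ssml"`; this sketch sends plain text only.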

Polly use cases

WordPress Polly plugin
Combine Rekognition and Polly to provide audio files from handwritten notes
Language learning applications
Voicing dialogs in games
Creating audiobooks

Comprehend

Natural language processing (NLP) service
Integrates with Amazon S3 and AWS Glue
Supports batch processing
Supports English, Spanish, French, German, Italian, Portuguese
5 million characters per month available in Free Tier
Alternatives: Google Cloud Natural Language API, Azure Language API

Comprehend features

Key phrase extraction
Sentiment analysis
Syntax analysis
Entity recognition
Language detection
Topic modeling
Medical Named Entity extraction
Custom classification models

Comprehend demo
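
A minimal sketch of sentiment analysis and key phrase extraction with boto3, assuming configured AWS credentials and English input. The function name `analyze_text` and the shape of the returned dictionary are illustrative choices.

```python
def analyze_text(text, client=None, language_code="en"):
    """Return the overall sentiment and the key phrases found in `text`."""
    if client is None:
        # Imported lazily so the sketch can be exercised with a stub client.
        import boto3
        client = boto3.client("comprehend")
    sentiment = client.detect_sentiment(Text=text, LanguageCode=language_code)
    phrases = client.detect_key_phrases(Text=text, LanguageCode=language_code)
    return {
        "sentiment": sentiment["Sentiment"],
        "key_phrases": [phrase["Text"] for phrase in phrases["KeyPhrases"]],
    }
```

A hotel review passed to `analyze_text` would come back with a sentiment label such as `POSITIVE` or `NEGATIVE` plus the phrases Comprehend considers most significant.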

Comprehend use cases

Analyze sentiment in hotel reviews
Generate tags for articles
Group documents by topic
Search text based on key phrases or sentiment

Transcribe

Automatic speech recognition service
Supports Spanish, English, French, Portuguese, Hindi, Korean, German
Supports multiple speakers and custom vocabulary
Optimized for telephony audio
Supports channel identification
60 minutes per month available in Free Tier
Alternatives: Google Cloud Speech API, Azure Speech API

Transcribe demo
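
A minimal sketch of starting a transcription job and polling until it finishes, assuming configured AWS credentials and an audio file already uploaded to S3. The function name `transcribe_s3_audio`, the polling interval, and the poll limit are assumptions for illustration.

```python
import time


def transcribe_s3_audio(job_name, media_uri, client=None, media_format="mp3",
                        language_code="en-US", poll_seconds=5.0, max_polls=120):
    """Start a transcription job and return the URI of the finished transcript."""
    if client is None:
        # Imported lazily so the sketch can be exercised with a stub client.
        import boto3
        client = boto3.client("transcribe")
    client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": media_uri},
        MediaFormat=media_format,
        LanguageCode=language_code,
    )
    # Transcription is asynchronous, so poll the job status until it settles.
    for _ in range(max_polls):
        job = client.get_transcription_job(TranscriptionJobName=job_name)["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job["Transcript"]["TranscriptFileUri"]
        if status == "FAILED":
            raise RuntimeError(job.get("FailureReason", "transcription failed"))
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_name!r} did not finish after {max_polls} polls")
```

The returned URI points to a JSON document containing the transcript; downloading and parsing it is left out of this sketch.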

Transcribe use cases

Generate podcast transcripts
Generate notes from board meetings
Combine with Comprehend for sentiment analysis of speech
Generate transcripts of telephone calls
Combine with Translate for generation of translated movie subtitles

Translate

Text translation service
Supports real-time and batch processing
Supports Named Entity Translation Customization
Automatically detects source language
Supports 25 languages, 595 translation combinations
2 million characters per month available in Free Tier
Alternatives: Google Cloud Translation API, Azure Translator Text API
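
A minimal sketch of real-time translation with automatic source language detection, assuming configured AWS credentials. The function name `translate` is illustrative; passing `"auto"` as the source language asks the service to detect it.

```python
def translate(text, target_language, client=None, source_language="auto"):
    """Return (translated text, detected or given source language code)."""
    if client is None:
        # Imported lazily so the sketch can be exercised with a stub client.
        import boto3
        client = boto3.client("translate")
    response = client.translate_text(
        Text=text,
        SourceLanguageCode=source_language,
        TargetLanguageCode=target_language,
    )
    return response["TranslatedText"], response["SourceLanguageCode"]
```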

Translate use cases

Automatic translation of customer reviews
Automatic translation of company website
Automatic translation of emails
Translation support in conversational applications
Translation capabilities in language learning applications
Integrate with Polly to provide audio files in multiple languages

Textract

Document analysis service, extracts text and data
Supports PDF, PNG and JPG
Currently only supports English
Support for form and table extraction
1000 pages per month for Detect Document Text API and 100 pages per month for Analyze Document API available in Free Tier
Alternatives: Google Cloud Vision API, Microsoft Azure Form Recognizer API [preview]
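
A minimal sketch of plain text extraction with the Detect Document Text API via boto3, assuming configured AWS credentials and a document image stored in S3. The function name `extract_lines` and the bucket and key values are illustrative.

```python
def extract_lines(bucket, key, client=None):
    """Return the lines of text detected in a document stored in S3."""
    if client is None:
        # Imported lazily so the sketch can be exercised with a stub client.
        import boto3
        client = boto3.client("textract")
    response = client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}},
    )
    # The response mixes PAGE, LINE and WORD blocks; keep whole lines only.
    return [block["Text"] for block in response["Blocks"] if block["BlockType"] == "LINE"]
```

Form and table extraction goes through the separate Analyze Document API, which this sketch does not cover.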

Textract use cases

Extract text that can be later analyzed with Comprehend
Automatically extract key data from paper forms and documents
Combine it with Polly to provide audio files from processed documents

Lex

Service for building conversational interfaces
Supports both voice and text inputs
Uses the same technology as Amazon Alexa
10000 text requests and 5000 voice requests per month available in Free Tier
Alternatives: Dialogflow, Azure Bot Service

Lex features

Speech recognition
Natural language understanding
Multi-turn conversations
Confirmation and error prompts
Intent chaining
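
A minimal sketch of sending one text turn to an existing bot, assuming a bot built with Lex V2 and configured AWS credentials. The function name `send_message` and all identifier values are hypothetical; bots built on the original Lex API use a different runtime client.

```python
def send_message(text, bot_id, bot_alias_id, session_id, client=None, locale_id="en_US"):
    """Send a text message to a Lex V2 bot and return its reply messages."""
    if client is None:
        # Imported lazily so the sketch can be exercised with a stub client.
        import boto3
        client = boto3.client("lexv2-runtime")
    response = client.recognize_text(
        botId=bot_id,
        botAliasId=bot_alias_id,
        localeId=locale_id,
        sessionId=session_id,
        text=text,
    )
    # The bot may answer with zero or more messages for a single turn.
    return [message["content"] for message in response.get("messages", [])]
```

Reusing the same `session_id` across calls lets the bot keep context between turns, which is what multi-turn conversations rely on.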

Lex use cases

Customer support Messenger bots
Learning bots that ask questions and verify answers
Bots for adding voice control to connected devices, e.g. smart bulbs

Summary

If you want to...

analyze images and videos, use Rekognition
generate audio from text, use Polly
analyze text, use Comprehend
recognize speech, use Transcribe
translate text, use Translate
extract text from documents, use Textract
build a chatbot, use Lex

Thanks!

@p_grzesik

pj.grzesik@gmail.com
