Introduction to AWS Machine Learning AI Services for Pythonistas
Piotr Grzesik
What do I do?
What will I be talking about?
What are the benefits of Machine Learning APIs?
Overview of ML API services on AWS
How to use Machine Learning APIs in Python apps?
Building your own solution
Using Machine Learning APIs
Benefits of ML APIs
Does not require specific ML knowledge (can be used by "regular" developers)
Easy to consume via API/SDK
Trained on a large dataset
Models are constantly re-trained/improved
"Infinitely" scalable
Pay-per-use
Cons of ML APIs
Limited functionality, suitable only for common tasks
Can be much pricier than a custom solution in the long run
No way to customize/tweak models (there are small exceptions)
Vendor lock-in (not really)
Providers
What does AWS have to offer?
Rekognition Image/Video
Polly
Comprehend
Transcribe
Translate
Textract
Lex
Rekognition
Image and video recognition service
Supports processing images from S3 buckets
Supports batch and real-time processing
1000 minutes of video analysis and 5000 images per month available in Free Tier
Alternatives: Google Cloud Vision API, Azure Computer Vision API
Rekognition Image features
Object and scenery detection
Facial recognition
Facial analysis
Face comparison
Inappropriate image detection
Celebrity recognition
Text in image
Rekognition Video features
Person identification, tracking and pathing
Face recognition
Facial analysis
Objects, scenes and activities detection
Inappropriate video detection
Celebrity recognition
Rekognition Image demo
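A minimal sketch of object and scene detection with boto3, assuming AWS credentials and a default region are already configured. The helper name `detect_labels` and its default thresholds are illustrative choices, not part of the talk; the underlying call is the Rekognition `detect_labels` operation.

```python
def detect_labels(image_bytes, min_confidence=80.0, max_labels=10, client=None):
    """Return (label name, confidence) pairs that Rekognition finds in an image."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("rekognition")

    response = client.detect_labels(
        Image={"Bytes": image_bytes},   # raw bytes of a JPEG or PNG file
        MaxLabels=max_labels,           # cap on the number of labels returned
        MinConfidence=min_confidence,   # drop labels below this confidence (0-100)
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]
```

To analyze an image that already lives in S3, the `Image` argument can instead be `{"S3Object": {"Bucket": "my-bucket", "Name": "photo.jpg"}}`, where the bucket and key are placeholders.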
Rekognition use cases
Detect unsafe/inappropriate images and videos
Indexing images/videos based on scenery and/or emotions
Blurring faces on images and videos
User verification based on face image
License plate verification
Reading handwritten text from images
Polly
Speech synthesis service
Supports near real-time and batch processing
5 million characters for speech or Speech Marks requests per month available in Free Tier
Alternatives: Google Cloud Text-To-Speech
Polly features
Supports MP3, Ogg Vorbis and PCM formats
Supports generating "Speech Marks"
Supports SSML (Speech Synthesis Markup Language)
Supports PLS (Pronunciation Lexicon Specification)
Polly demo
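A minimal sketch of speech synthesis with boto3, assuming configured AWS credentials. The helper name `synthesize_speech_bytes` and the default voice "Joanna" are illustrative choices; the underlying call is the Polly `synthesize_speech` operation.

```python
def synthesize_speech_bytes(text, voice_id="Joanna", output_format="mp3", client=None):
    """Return the synthesized audio for `text` as raw bytes."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("polly")

    response = client.synthesize_speech(
        Text=text,
        VoiceId=voice_id,
        OutputFormat=output_format,  # "mp3", "ogg_vorbis" or "pcm"
    )
    # AudioStream is a streaming body; read() pulls the whole clip into memory.
    return response["AudioStream"].read()
```

The returned bytes can be written to a file opened in binary mode, for example `open("hello.mp3", "wb")`. For SSML input, the same operation accepts `TextType="ssml"`.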
Polly use cases
WordPress Polly plugin
Combine Rekognition and Polly to provide audio files from handwritten notes
Language learning applications
Voicing dialogs in games
Creating audiobooks
Comprehend
Natural language processing (NLP) service
Integrates with Amazon S3 and AWS Glue
Supports batch processing
Supports English, Spanish, French, German, Italian, Portuguese
5 million characters per month available in Free Tier
Alternatives: Google Cloud Natural Language API, Azure Language API
Comprehend features
Key phrase extraction
Sentiment analysis
Syntax analysis
Entity recognition
Language detection
Topic modeling
Medical Named Entity extraction
Custom classification models
Comprehend demo
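A minimal sketch of sentiment analysis with boto3, assuming configured AWS credentials. The helper name `analyze_sentiment` is an illustrative choice; the underlying call is the Comprehend `detect_sentiment` operation.

```python
def analyze_sentiment(text, language_code="en", client=None):
    """Return (sentiment, score), e.g. ("POSITIVE", 0.97), for a piece of text."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("comprehend")

    response = client.detect_sentiment(Text=text, LanguageCode=language_code)
    sentiment = response["Sentiment"]  # POSITIVE, NEGATIVE, NEUTRAL or MIXED
    # SentimentScore keys are capitalized: Positive, Negative, Neutral, Mixed.
    score = response["SentimentScore"][sentiment.capitalize()]
    return sentiment, score
```

The same client exposes `detect_key_phrases`, `detect_entities` and `detect_dominant_language` for the other features listed above.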
Comprehend use cases
Analyze sentiment in hotel reviews
Generate tags for articles
Group documents by topic
Search text based on key phrases or sentiment
Transcribe
Automatic speech recognition service
Supports Spanish, English, French, Portuguese, Hindi, Korean, German
Supports multiple speakers and custom vocabulary
Optimized for Telephony Audio
Supports channel identification
60 minutes per month available in Free Tier
Alternatives: Google Cloud Speech API, Azure Speech API
Transcribe demo
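Transcription jobs are asynchronous, so a client has to start a job and then poll for its result. A minimal sketch with boto3, assuming configured AWS credentials and an audio file already uploaded to S3. The helper name `transcribe_audio` and the polling parameters are illustrative choices; the underlying calls are `start_transcription_job` and `get_transcription_job`.

```python
import time


def transcribe_audio(job_name, media_uri, language_code="en-US",
                     poll_seconds=5.0, max_polls=120, client=None):
    """Start a transcription job and return the URI of the finished transcript."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("transcribe")

    client.start_transcription_job(
        TranscriptionJobName=job_name,     # must be unique within the account
        Media={"MediaFileUri": media_uri}, # e.g. an s3:// URI of the audio file
        LanguageCode=language_code,
    )

    # Poll until the job reaches a terminal state or we give up.
    for _ in range(max_polls):
        job = client.get_transcription_job(
            TranscriptionJobName=job_name
        )["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job["Transcript"]["TranscriptFileUri"]
        if status == "FAILED":
            raise RuntimeError(job.get("FailureReason", "transcription failed"))
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_name!r} did not finish after {max_polls} polls")
```

The returned URI points to a JSON document containing the transcript text.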
Transcribe use cases
Generate podcast transcripts
Generate notes from board meetings
Combine with Comprehend for sentiment analysis of speech
Generate transcripts of telephone calls
Combine with Translate for generation of translated movie subtitles
Translate
Text translation service
Supports real-time and batch processing
Supports Named Entity Translation Customization
Automatically detects source language
Supports 25 languages, 595 translation combinations
2 million characters per month available in Free Tier
Alternatives: Google Cloud Translation API, Azure Translator Text API
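A minimal sketch of real-time translation with automatic source language detection, assuming configured AWS credentials. The helper name `translate` is an illustrative choice; the underlying call is the Translate `translate_text` operation, where a source language code of "auto" asks the service to detect the language.

```python
def translate(text, target_language, source_language="auto", client=None):
    """Return (translated text, source language code reported by the service)."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("translate")

    response = client.translate_text(
        Text=text,
        SourceLanguageCode=source_language,  # "auto" lets the service detect it
        TargetLanguageCode=target_language,  # e.g. "pl", "de", "es"
    )
    return response["TranslatedText"], response["SourceLanguageCode"]
```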
Translate use cases
Automatic translation of customer reviews
Automatic translation of company website
Automatic translation of emails
Translation support in conversational applications
Translation capabilities in language learning applications
Integrate with Polly to provide audio files in multiple languages
Textract
Document analysis service, extracts text and data
Supports PDF, PNG and JPG
Currently only supports English
Support for form and table extraction
1000 pages per month for Detect Document Text API and 100 pages per month for Analyze Document API available in Free Tier
Alternatives: Google Cloud Vision API, Microsoft Azure Form Recognizer API [preview]
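A minimal sketch of plain text extraction with boto3, assuming configured AWS credentials and a PNG or JPG document passed as bytes. The helper name `extract_lines` is an illustrative choice; the underlying call is the Textract `detect_document_text` operation.

```python
def extract_lines(document_bytes, client=None):
    """Return the lines of text Textract detects in a document image."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("textract")

    response = client.detect_document_text(Document={"Bytes": document_bytes})
    # The response mixes PAGE, LINE and WORD blocks; keep only whole lines.
    return [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]
```

Form and table extraction goes through the separate `analyze_document` operation with `FeatureTypes=["FORMS", "TABLES"]`.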
Textract use cases
Extract text that can be later analyzed with Comprehend
Automatically extract key data from paper forms and documents
Combine it with Polly to provide audio files from processed documents
Lex
Service for building conversational interfaces
Supports both voice and text inputs
Uses the same technology as Amazon Alexa
10000 text requests and 5000 voice requests per month available in Free Tier
Alternatives: Dialogflow, Azure Bot Service
Lex features
Speech recognition
Natural language understanding
Multi-turn conversations
Confirmation and error prompts
Intent chaining
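A minimal sketch of sending one text turn to an existing bot, assuming configured AWS credentials and a bot that has already been built and published. This uses the Lex V2 runtime client (`lexv2-runtime`), which is an assumption on my side since the talk does not say which Lex version it targets. The helper name `send_text` and all identifiers passed to it are hypothetical placeholders.

```python
def send_text(bot_id, bot_alias_id, session_id, text, locale_id="en_US", client=None):
    """Send one user utterance to a Lex V2 bot and return the bot's replies."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("lexv2-runtime")

    response = client.recognize_text(
        botId=bot_id,
        botAliasId=bot_alias_id,
        localeId=locale_id,
        sessionId=session_id,  # reuse the same id to keep a multi-turn conversation
        text=text,
    )
    # "messages" may be missing when the bot has nothing to say for this turn.
    return [message["content"] for message in response.get("messages", [])]
```

Reusing one `session_id` across calls is what lets the bot carry context between turns.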
Lex use cases
Customer support Messenger bots
Learning bots that ask questions and verify answers
Bots for adding voice control to connected devices e.g. smart bulbs
Summary
If you want to...
analyze images and videos, use Rekognition
generate audio from text, use Polly
analyze text, use Comprehend
recognize speech, use Transcribe
translate text, use Translate
extract text from documents, use Textract
build a chatbot, use Lex
Thanks!
@p_grzesik
pj.grzesik@gmail.com