Introduction to AWS Machine Learning AI Services for Pythonistas

Piotr Grzesik

What do I do?

What will I be talking about?

  • What are the benefits of Machine Learning APIs?
  • Overview of ML API services on AWS
  • How to use Machine Learning APIs in Python apps?

Building your own solution

Using Machine Learning APIs

Benefits of ML APIs

  • Does not require specific ML knowledge (can be used by "regular" developers)
  • Easy to consume via API/SDK
  • Trained on a large dataset
  • Models are constantly re-trained/improved
  • "Infinitely" scalable
  • Pay-per-use

Cons of ML APIs

  • Limited functionality, suitable only for common tasks
  • Can be much pricier than a custom solution in the long run
  • No way to customize/tweak models (there are small exceptions)
  • Vendor lock-in (not really)

Providers

What does AWS have to offer?

  • Rekognition Image/Video
  • Polly
  • Comprehend
  • Transcribe
  • Translate
  • Textract
  • Lex

Rekognition

  • Image and video recognition service
  • Supports processing images from S3 buckets
  • Supports batch and real-time processing
  • 1000 minutes of video analysis and 5000 images per month available in Free Tier
  • Alternatives: Google Cloud Vision API, Azure Computer Vision API

Rekognition Image features

  • Object and scenery detection
  • Facial recognition
  • Facial analysis
  • Face comparison
  • Inappropriate image detection
  • Celebrity recognition
  • Text in image

Rekognition Video features

  • Person identification, tracking and pathing
  • Face recognition
  • Facial analysis
  • Objects, scenes and activities detection
  • Inappropriate video detection
  • Celebrity recognition

Rekognition Image demo
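
The demo itself is not preserved in these slides, so the following is only a minimal sketch of label detection for an image stored in S3, assuming boto3 is installed and AWS credentials are configured. The helper name `detect_labels`, its defaults, and the bucket and key names are illustrative assumptions, not taken from the talk.

```python
def detect_labels(bucket, key, client=None, min_confidence=80.0, max_labels=10):
    """Return (label name, confidence) pairs for an image stored in S3."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("rekognition")

    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    # Each entry in "Labels" carries a name and a confidence score in percent.
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]
```

Calling `detect_labels("my-bucket", "photo.jpg")` (hypothetical names) would send one request to Rekognition and return the detected labels with their confidence scores.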

Rekognition use cases

  • Detect unsafe/inappropriate images and videos
  • Indexing images/videos based on scenery and/or emotions
  • Blurring faces on images and videos
  • User verification based on face image
  • License plate verification
  • Reading handwritten text from images

Polly

  • Speech synthesis service
  • Supports near real-time and batch processing
  • 5 million characters for speech or Speech Marks requests per month available in Free Tier
  • Alternatives: Google Cloud Text-To-Speech

Polly features

  • Supports mp3, Vorbis and PCM formats
  • Supports generating "Speech Marks"
  • Supports SSML (Speech Synthesis Markup Language)
  • Supports PLS (Pronunciation Lexicon Specification)

Polly demo
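
The original demo is not part of this export. As a rough sketch of what speech synthesis with boto3 can look like, assuming configured AWS credentials, the hypothetical helper below writes an mp3 file to disk. The function name, the default voice "Joanna", and the output path are assumptions made for illustration.

```python
def synthesize_to_file(text, path, client=None, voice_id="Joanna"):
    """Synthesize `text` to an mp3 file at `path`; return the number of bytes written."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("polly")

    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # "AudioStream" is a file-like object holding the encoded audio.
    audio = response["AudioStream"].read()
    with open(path, "wb") as fh:
        fh.write(audio)
    return len(audio)
```

For SSML input, `synthesize_speech` also accepts `TextType="ssml"`; that is left out here to keep the sketch short.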

Polly use cases

  • WordPress Polly plugin
  • Combine Rekognition and Polly to provide audio files from handwritten notes
  • Language learning applications
  • Voicing dialogs in games
  • Creating audiobooks

Comprehend

  • Natural language processing (NLP) service
  • Integrates with Amazon S3 and AWS Glue
  • Supports batch processing
  • Supports English, Spanish, French, German, Italian, Portuguese
  • 5 million characters per month available in Free Tier
  • Alternatives: Google Cloud Natural Language API, Azure Language API

Comprehend features

  • Keyphrase extraction
  • Sentiment analysis
  • Syntax analysis
  • Entity recognition
  • Language detection
  • Topic modeling
  • Medical Named Entity extraction
  • Custom classification models

Comprehend demo
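
The demo content is not included in the slides. A minimal sketch of sentiment analysis and key phrase extraction with boto3 might look as follows, assuming configured AWS credentials and English input. The helper name `analyze_review` and the shape of its return value are illustrative choices, not the speaker's code.

```python
def analyze_review(text, client=None, language_code="en"):
    """Return overall sentiment, per-class scores and key phrases for `text`."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("comprehend")

    sentiment = client.detect_sentiment(Text=text, LanguageCode=language_code)
    phrases = client.detect_key_phrases(Text=text, LanguageCode=language_code)
    return {
        # One of POSITIVE, NEGATIVE, NEUTRAL or MIXED.
        "sentiment": sentiment["Sentiment"],
        "scores": sentiment["SentimentScore"],
        "key_phrases": [phrase["Text"] for phrase in phrases["KeyPhrases"]],
    }
```

This makes two separate API calls per text, so both count towards the per-character usage.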

Comprehend use cases

  • Analyze sentiment in hotel reviews
  • Generate tags for articles
  • Group documents by topic
  • Search text based on key phrases or sentiment

Transcribe

  • Automatic speech recognition service
  • Supports Spanish, English, French, Portuguese, Hindi, Korean, German
  • Supports multiple speakers and custom vocabulary
  • Optimized for Telephony Audio
  • Supports channel identification
  • 60 minutes per month available in Free Tier
  • Alternatives: Google Cloud Speech API, Azure Speech API

Transcribe demo
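
The demo is not preserved here. Transcription jobs are asynchronous, so one plausible sketch is to start a job and poll until it finishes, assuming configured AWS credentials and an audio file already uploaded to S3. The helper name, the polling interval, and the poll limit are assumptions for illustration.

```python
import time


def transcribe_audio(job_name, media_uri, client=None, language_code="en-US",
                     media_format="mp3", poll_seconds=5.0, max_polls=120):
    """Start a transcription job and return the transcript file URI once it completes."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("transcribe")

    client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": media_uri},
        MediaFormat=media_format,
        LanguageCode=language_code,
    )
    # Poll the job status until it completes, fails, or the poll limit is reached.
    for _ in range(max_polls):
        job = client.get_transcription_job(TranscriptionJobName=job_name)["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job["Transcript"]["TranscriptFileUri"]
        if status == "FAILED":
            raise RuntimeError(job.get("FailureReason", "transcription failed"))
        time.sleep(poll_seconds)
    raise TimeoutError(f"transcription job {job_name!r} did not finish in time")
```

The returned URI points to a JSON document containing the transcript, which still has to be downloaded and parsed separately.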

Transcribe use cases

  • Generate podcast transcripts
  • Generate notes from board meetings
  • Combine with Comprehend for sentiment analysis of speech
  • Generate transcripts of telephone calls
  • Combine with Translate for generation of translated movie subtitles

Translate

  • Text translation service
  • Supports real-time and batch processing
  • Supports Named Entity Translation Customization
  • Automatically detects source language
  • Supports 25 languages, 595 translation combinations
  • 2 million characters per month available in Free Tier
  • Alternatives: Google Cloud Translation API, Azure Translator Text API
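
As an illustration of the real-time mode with automatic source language detection, here is a minimal boto3 sketch, assuming configured AWS credentials. The helper name `translate` and its return shape are assumptions, not part of the talk.

```python
def translate(text, target_language, client=None, source_language="auto"):
    """Translate `text`; return (translated text, source language code used)."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("translate")

    response = client.translate_text(
        Text=text,
        # "auto" asks the service to detect the source language itself.
        SourceLanguageCode=source_language,
        TargetLanguageCode=target_language,
    )
    return response["TranslatedText"], response["SourceLanguageCode"]
```

A call such as `translate("Dzień dobry", "en")` would return the English text together with the language code the service detected for the input.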

Translate use cases

  • Automatic translation of customer reviews
  • Automatic translation of company website
  • Automatic translation of emails
  • Translation support in conversational applications
  • Translation capabilities in language learning applications
  • Integrate with Polly to provide audio files in multiple languages

Textract

  • Document analysis service, extracts text and data
  • Supports PDF, PNG and JPG
  • Currently only supports English
  • Support for form and table extraction
  • 1000 pages per month for Detect Document Text API and 100 pages per month for Analyze Document API available in Free Tier
  • Alternatives: Google Cloud Vision API, Microsoft Azure Form Recognizer API [preview]
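
A minimal sketch of plain text extraction with the Detect Document Text API could look like this, assuming configured AWS credentials and a PNG or JPG document already stored in S3. The helper name `extract_lines` and the bucket and key names are hypothetical.

```python
def extract_lines(bucket, key, client=None):
    """Return the lines of text detected in a document image stored in S3."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("textract")

    response = client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}},
    )
    # The response mixes PAGE, LINE and WORD blocks; keep only whole lines.
    return [block["Text"] for block in response["Blocks"] if block["BlockType"] == "LINE"]
```

Form and table extraction goes through the separate Analyze Document API, which this sketch does not cover.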

Textract use cases

  • Extract text that can be later analyzed with Comprehend
  • Automatically extract key data from paper forms and documents
  • Combine it with Polly to provide audio files from processed documents

Lex

  • Service for building conversational interfaces
  • Supports both voice and text inputs
  • Uses the same technology as Amazon Alexa
  • 10000 text requests and 5000 voice requests per month available in Free Tier
  • Alternatives: Dialogflow, Azure Bot Service

Lex features

  • Speech recognition
  • Natural language understanding
  • Multi-turn conversations
  • Confirmation and error prompts
  • Intent chaining
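
To give an idea of how a Python app might send a text message to a bot, here is a hedged sketch. It assumes the newer Lex V2 runtime client in boto3 (`lexv2-runtime`), which the slides do not mention and which may differ from the API version shown in the talk. The helper name, the bot ID, the alias ID, and the session ID are all placeholders.

```python
def ask_bot(text, session_id, bot_id, bot_alias_id, client=None, locale_id="en_US"):
    """Send one user utterance to a Lex V2 bot and return the bot's text replies."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("lexv2-runtime")

    response = client.recognize_text(
        botId=bot_id,
        botAliasId=bot_alias_id,
        localeId=locale_id,
        # Reusing the same session ID keeps multi-turn conversation state.
        sessionId=session_id,
        text=text,
    )
    # "messages" may be absent when the bot has nothing to say for this turn.
    return [m["content"] for m in response.get("messages", []) if "content" in m]
```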

Lex use cases

  • Customer support Messenger bots
  • Learning bots that ask questions and verify answers
  • Bots for adding voice control to connected devices e.g. smart bulbs

Summary

If you want to...

  • analyze images and videos, use Rekognition
  • generate audio from text, use Polly
  • analyze text, use Comprehend
  • recognize speech, use Transcribe
  • translate text, use Translate
  • extract text from documents, use Textract
  • build a chatbot, use Lex

Thanks!

@p_grzesik

pj.grzesik@gmail.com

Introduction to AWS Machine Learning Application Services for Pythonistas

By progressive

Machine Learning is rapidly gaining popularity and is being used for a variety of applications. What if you'd like to enhance your Python app with AI capabilities, but don't have the resources to develop them on your own? AWS Machine Learning Application Services to the rescue! During my presentation, I will describe what services AWS has to offer when it comes to ML-driven APIs, and I'll show how Pythonistas can leverage services like AWS Polly, AWS Rekognition or AWS Transcribe to add image analysis or text-to-speech conversion to their services.
