Making computers "see"

Raúl Roa

Disclaimer

Who's this guy?

Chief Technology Officer

WE MAKE COMPUTERS DO AMAZING THINGS...

PUT STUFF WHERE IT BELONGS

MAKE THEM UNDERSTAND

I DO MORE

Chief Technology Officer at Yoyo

Payment facilitator for small businesses


Technical Adviser & Business Partner at Digital Reality
Virtual reality and augmented reality for developing countries.

 

Minor OSS contributor

DevIL, ResIL, fog, Emscripten

 

Some of our customers

What's this talk about?

TERMINATOR... OF COURSE

SERIOUSLY...

IF WE TRIED TO, WE COULD ACTUALLY BUILD TERMINATORS TODAY

LET'S BREAK IT DOWN

T-1000 FEATURES

  • DOESN'T TALK
  • MORPHS INTO A SMALL DOG
  • IDENTIFIES THE PERSON OR THINGS OF INTEREST
    • AGE
    • GENDER
    • SENTIMENT
  • PREDICTS WHERE THE PERSON OF INTEREST WILL BE
  • DEPENDING ON CONTEXT IT KNOWS WHAT OBJECTS ARE AND HOW TO USE THEM

WAIT... HOW IS THIS EVEN POSSIBLE?

BECAUSE OF BUZZWORDS!

"ARTIFICIAL INTELLIGENCE"

"MACHINE LEARNING"

"COMPUTER VISION"

...use the observed image data to infer something about the world.

— Page 83, Computer Vision: Models, Learning, and Inference, 2012.

Overview of the Relationship of Artificial Intelligence and Computer Vision
https://machinelearningmastery.com/what-is-computer-vision/

BUT WHY ARE THESE THINGS IN OUR HANDS NOW?

3 DRIVING FORCES...

  • COMPUTING POWER
  • DATA AVAILABILITY
  • BETTER ALGORITHMS

What about Python or R?

Why should I care? This sounds like computer-science stuff

Well, not really...

Let's do it step by step

First...

To see it is NOT to read.

Rasterization & Color Representation

Illustration of Bitmaps, Image from an ITNEXT Article on Medium
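
A minimal sketch of what "an image is a grid of numbers" means in practice. It assumes 8-bit RGB pixels stored as plain Python tuples; the `to_grayscale` helper, the tiny 2x2 image, and the choice of BT.601 luma weights are illustrative assumptions, not something taken from the slides.

```python
# A raster image is a grid of pixels; each pixel here is an (R, G, B) tuple, 0-255.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def to_grayscale(img):
    """Collapse each RGB pixel to one brightness value (assumed BT.601 weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in img
    ]

gray = to_grayscale(image)
print(gray)
```

Green contributes most to perceived brightness, which is why the green pixel ends up much lighter than the blue one.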

Edge detection & feature extraction

Sobel's Method Demonstrated
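
A rough sketch of Sobel edge detection on a grayscale grid, written with plain Python lists so it needs no third-party libraries. The function name `sobel_magnitude` and the 4x4 test image are hypothetical; border pixels are simply left at zero, which is one of several possible conventions.

```python
import math

# Sobel kernels: GX responds to left-right intensity changes, GY to up-down ones.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    """Return the gradient magnitude for interior pixels of a 2D grayscale grid."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out

# Dark left half, bright right half: a vertical edge between columns 1 and 2.
img = [[0, 0, 255, 255] for _ in range(4)]
edges = sobel_magnitude(img)
```

Interior pixels next to the dark/bright boundary get a large response, while the untouched border stays at zero.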

Hough Transform and Autonomous Driving

Processed Image of Dotted Lines, Image from a Medium Article
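
A minimal sketch of the voting idea behind the Hough transform for lines, assuming the edge pixels have already been extracted. The `hough_lines` helper, the one-degree angle resolution, and the sample points are assumptions made for illustration; real lane-detection pipelines use tuned, optimized implementations.

```python
import math
from collections import Counter

def hough_lines(points, theta_steps=180):
    """Vote in (rho, theta) space: each edge point votes for every line through it."""
    votes = Counter()
    for x, y in points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, t)] += 1
    return votes

# Edge points lying on the horizontal line y = 5 (a stand-in for a lane marking).
points = [(x, 5) for x in range(10)]
votes = hough_lines(points)
best_count = max(votes.values())
```

The bin for rho = 5 at 90 degrees collects a vote from every point. With rounding this coarse, a few neighbouring angle bins can tie with it, which is why practical implementations pick local maxima rather than a single winner.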

Convolutional Neural Networks

LeNet architecture

The Layer Types

  • Convolutional layers
  • Pooling layers
  • Rectified Linear Unit (ReLU) layers
  • Fully Connected layers

The convolution operation

Convolutional Layer

3x3 Convolutional Layer
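
A small sketch of what a 3x3 convolutional layer computes for a single filter, assuming stride 1 and no padding. The `conv2d` name and the sample values are hypothetical. Like most deep learning libraries, this version does not flip the kernel, so strictly speaking it is a cross-correlation.

```python
def conv2d(img, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [
        [
            sum(kernel[j][i] * img[y + j][x + i]
                for j in range(kh) for i in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

img = [
    [1, 2, 3, 0],
    [4, 5, 6, 0],
    [7, 8, 9, 0],
    [0, 0, 0, 0],
]
# Identity kernel: only the centre weight is non-zero, so it picks the centre pixel.
kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
feature_map = conv2d(img, kernel)
```

A 4x4 input with a 3x3 kernel yields a 2x2 feature map. In a trained network the kernel weights are learned rather than hand-picked.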

Pooling Layer

2x2 Pooling Layer
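
A sketch of 2x2 pooling with stride 2. Max pooling is assumed here (average pooling is another common choice), and `max_pool_2x2` is a hypothetical helper name.

```python
def max_pool_2x2(fmap):
    """Downsample by keeping the maximum of each non-overlapping 2x2 window."""
    return [
        [
            max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
            for x in range(0, len(fmap[0]) - 1, 2)
        ]
        for y in range(0, len(fmap) - 1, 2)
    ]

fmap = [
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
pooled = max_pool_2x2(fmap)
```

Each output value summarizes four inputs, so the feature map shrinks to a quarter of its size while keeping the strongest responses.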

ReLU Layer
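
ReLU is simple enough to sketch in a few lines: negative activations become zero, everything else passes through. The `relu` helper below works on a 2D feature map stored as nested lists, which is an assumption made for illustration.

```python
def relu(fmap):
    """Replace every negative activation with zero; keep the rest unchanged."""
    return [[max(0, v) for v in row] for row in fmap]

activated = relu([[-2, 3], [0, -7]])
```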

Fully Connected Layer
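
A sketch of a fully connected layer as a matrix-vector product plus a bias. The weights, biases, and the flattened input vector are made-up numbers; in a real network they would come from training and from the preceding layers.

```python
def fully_connected(inputs, weights, biases):
    """Each output is a weighted sum of every input, plus that output's bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

flattened = [6, 4, 7, 9]          # e.g. a 2x2 feature map flattened to a vector
weights = [
    [1, 0, 0, 0],                 # output 0 looks only at the first input
    [0.5, 0.5, 0.5, 0.5],         # output 1 mixes all inputs equally
]
biases = [0, -1]
scores = fully_connected(flattened, weights, biases)
```

Every output is connected to every input, which is what distinguishes this layer from a convolution that only looks at a small window.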

Image Classification

Forward propagation

Math behind perceptrons
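
One common way to write the forward pass of a single neuron is shown below. The symbols are assumptions introduced here: $x_i$ are the inputs, $w_i$ the weights, $b$ the bias, and $\sigma$ a sigmoid activation. The classic perceptron uses a step function instead of a sigmoid.

```latex
z = \sum_{i=1}^{n} w_i x_i + b, \qquad \hat{y} = \sigma(z) = \frac{1}{1 + e^{-z}}
```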

Back propagation

Back Propagation of Artificial Neural Network
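
A minimal sketch of back propagation for a single sigmoid neuron with a squared-error loss. The `train_step` helper, the learning rate, and the training example are all hypothetical; the point is only to show the chain rule turning an output error into weight updates.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, target, lr=0.5):
    """One forward pass, one backward pass, one gradient descent update."""
    # Forward: weighted sum, then activation.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    y = sigmoid(z)
    loss = 0.5 * (y - target) ** 2
    # Backward: chain rule, dL/dz = dL/dy * dy/dz.
    dz = (y - target) * y * (1.0 - y)
    # dz/dw_i = x_i and dz/db = 1, so the updates are:
    new_w = [wi - lr * dz * xi for wi, xi in zip(w, x)]
    new_b = b - lr * dz
    return new_w, new_b, loss

w, b = [0.0, 0.0], 0.0
losses = []
for _ in range(200):
    w, b, loss = train_step(w, b, x=[1.0, 2.0], target=1.0)
    losses.append(loss)
```

The loss recorded at each step shrinks as the weights move in the direction that reduces the error. A full network repeats the same chain-rule step layer by layer, from the output back to the input.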

Semantic Segmentation

Semantic Segmentation with CASENet, Demo from Youtube
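
Semantic segmentation labels every pixel with a class. A rough sketch of the final step, assuming a model has already produced per-pixel class scores: pick the best-scoring class for each pixel. The `segment` helper, the class names, and the scores are invented for illustration and do not come from CASENet.

```python
def segment(scores):
    """Assign every pixel the index of its highest-scoring class."""
    return [
        [max(range(len(pixel)), key=lambda c: pixel[c]) for pixel in row]
        for row in scores
    ]

# scores[y][x] is a list with one score per class; class names are hypothetical.
CLASSES = ["road", "car", "person"]
scores = [
    [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1]],
    [[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]],
]
mask = segment(scores)
labels = [[CLASSES[c] for c in row] for row in mask]
```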

Semantic Segmentation and Mask R-CNN

Semantic Segmentation with Mask R-CNN, Demo from Youtube

Object Detection

YOLO v1, YOLO v2 & YOLO v3 in action
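
Detectors such as YOLO output bounding boxes, and a standard way to compare a predicted box with a reference box is intersection over union (IoU). The sketch below assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates; the `iou` helper is illustrative, not code from any YOLO release.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
disjoint = iou((0, 0, 1, 1), (2, 2, 3, 3))
```

An IoU of 1 means the boxes coincide and 0 means they do not overlap at all. The same measure is typically used to discard duplicate detections of one object.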

So what about Terminator?

John Connor is a person of interest

WE KNOW WHAT HE LOOKS LIKE...

HOW CAN WE TRACK HIM?

HAAR CASCADES IN ACTION
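
Haar cascades are built from many simple rectangle features evaluated quickly through an integral image. The sketch below shows only that building block, not the boosted cascade itself. All helper names and the sample image are hypothetical.

```python
def integral_image(img):
    """ii[y][x] is the sum of all pixels strictly above and to the left of (x, y)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Haar-like edge feature: bottom half minus top half of a w-by-2h window."""
    return rect_sum(ii, x, y + h, w, h) - rect_sum(ii, x, y, w, h)

# A dark band over a bright band, loosely like eyes above cheeks.
img = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
ii = integral_image(img)
response = two_rect_feature(ii, 0, 0, 4, 2)
```

A large response means the window matches the dark-over-bright pattern. A cascade chains many such features and rejects non-matching windows early, which is what makes it fast enough for tracking.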

There are many other applications erm... Terminators

OCR

Complex Document Manipulation

Poverty Indexes

Protein Folding Prediction

Self-Driving Cars

THE CHALLENGE

CHALLENGES

  • DATA COLLECTION
  • DATA CURATION
  • DATA QUALITY
  • NO ONE SIZE FITS ALL
  • SPEED vs ACCURACY

HASTA LA VISTA, BABY!

Making computers "see"

By Raúl G. Roa Gómez
