Making computers "see"
Raúl Roa

Disclaimer

Who's this guy

Chief Technology Officer
WE MAKE COMPUTERS DO AMAZING THINGS...
PUT STUFF WHERE IT BELONGS

MAKE THEM UNDERSTAND

I DO MORE
Chief Technology Officer at Yoyo
Payment facilitator for small businesses
Technical Adviser & Business partner at Digital Reality
Virtual Reality and augmented reality for developing countries.
Minor OSS contributor
DevIL, ResIL, fog, Emscripten
Some of our customers

what's this talk about?
TERMINATOR... OF COURSE

SERIOUSLY...
IF WE TRIED TO, WE COULD ACTUALLY BUILD TERMINATORS TODAY

LET'S BREAK IT DOWN
T-1000 FEATURES
- DOESN'T TALK
- MORPHS INTO A SMALL DOG
- IDENTIFIES THE PERSON/THINGS OF INTEREST
- AGE
- GENDER
- SENTIMENT
- PREDICTS WHERE THE PERSON OF INTEREST WILL BE
- DEPENDING ON CONTEXT IT KNOWS WHAT OBJECTS ARE AND HOW TO USE THEM

WAIT... HOW IS THIS EVEN POSSIBLE?
BECAUSE OF BUZZ WORDS!

"ARTIFICIAL INTELLIGENCE"
"MACHINE LEARNING"
"COMPUTER VISION"
...use the observed image data to infer something about the world.
— Page 83, Computer Vision: Models, Learning, and Inference, 2012.

Overview of the Relationship of Artificial Intelligence and Computer Vision
https://machinelearningmastery.com/what-is-computer-vision/
BUT WHY ARE THESE THINGS IN OUR HANDS NOW?
3 DRIVING FORCES...
- COMPUTING POWER
- DATA AVAILABILITY
- BETTER ALGORITHMS
What about Python or R?
Why should I care? This sounds like computer science related stuff
Well, not really...

Let's do it step by step
First...
To see is NOT to read.
Rasterization & Color Representation

Illustration of Bitmaps, Image from an ITNEXT Article on Medium
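
A minimal sketch of the idea on this slide: a raster image is just a grid of pixels, and each pixel is a small tuple of numbers. The tiny 2x2 image and the BT.601 luma weights used for the grayscale conversion are assumptions chosen for illustration, not something taken from the talk.

```python
# A raster image as a grid of (R, G, B) pixels, each channel in 0..255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # red, green
    [(0, 0, 255), (255, 255, 255)],  # blue, white
]

def to_grayscale(rgb_image):
    """Collapse each RGB pixel to one brightness value (ITU-R BT.601 weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]

gray = to_grayscale(image)
print(gray)
```

Most classic vision steps (edges, lines, features) start from a single-channel grid like `gray`.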
Edge detection & feature extraction


Sobel's Method Demonstrated
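
A rough sketch of Sobel edge detection in plain Python, assuming a grayscale image stored as a list of rows. The 4x6 test image (dark left half, bright right half) is made up for the example; real code would normally use an image library instead of nested loops.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(gray):
    """Return the gradient magnitude for every interior pixel (no padding)."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * (w - 2) for _ in range(h - 2)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    p = gray[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * p
                    gy += SOBEL_Y[j][i] * p
            out[y - 1][x - 1] = math.hypot(gx, gy)
    return out

# Hypothetical test image: a vertical edge between columns 2 and 3.
gray = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
edges = sobel_magnitude(gray)
print(edges)
```

The response is zero in the flat regions and large only where brightness changes, which is what makes the edge stand out.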
Hough Transform and Autonomous Driving

Processed Image of Dotted Lines, Image from a Medium Article
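
A small sketch of the voting idea behind the Hough transform, assuming edge pixels have already been extracted. Each point votes for every line (rho, theta) that could pass through it, and the bin with the most votes is the detected line. The dashed horizontal "lane marking" and the one-degree angle resolution are assumptions for the example.

```python
import math

def hough_lines(points, n_theta=180):
    """Accumulate votes in (rho, theta index) space for a list of (x, y) points."""
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            acc[(rho, t)] = acc.get((rho, t), 0) + 1
    return acc

# Hypothetical edge pixels from a dashed horizontal line at y = 5.
points = [(x, 5) for x in range(0, 100, 5)]
acc = hough_lines(points)
(best_rho, best_t), votes = max(acc.items(), key=lambda kv: kv[1])
print(best_rho, best_t, votes)
```

Because the votes pile up in one bin even when the line is dotted, gaps in the marking do not stop the line from being found.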
Convolutional Neural Networks

LeNet architecture
The Layer Types
- Convolutional
- Pooling layers
- Rectified Linear Unit (ReLU) layers
- Fully Connected layers
The convolution operation

Convolutional Layer

3x3 Convolutional Layer
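
A minimal sketch of what a 3x3 convolutional layer computes, assuming no padding and a stride of 1. As in most deep learning libraries, the kernel is slid over the image without being flipped (strictly speaking a cross-correlation). The image and kernel values are made up; in a real network the kernel weights are learned.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[y + j][x + i] * kernel[j][i]
                for j in range(kh)
                for i in range(kw)
            )
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
# Hypothetical kernel that responds to left-to-right brightness changes.
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
feature_map = conv2d(image, kernel)
print(feature_map)
```

A 4x4 input and a 3x3 kernel give a 2x2 feature map, one value per kernel position.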
Pooling Layer

2x2 Pooling Layer
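
A quick sketch of 2x2 max pooling with a stride of 2, assuming the feature map has even width and height. Max pooling is one common choice; the slide does not say which pooling variant is meant.

```python
def max_pool_2x2(fmap):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [
            max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
            for x in range(0, len(fmap[0]) - 1, 2)
        ]
        for y in range(0, len(fmap) - 1, 2)
    ]

fmap = [
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]
pooled = max_pool_2x2(fmap)
print(pooled)
```

The output is half the size in each dimension, which cuts computation and makes the network less sensitive to small shifts.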
ReLU Layer
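
ReLU is simple enough to sketch in one line: negative activations become zero and everything else passes through unchanged. The input values below are made up.

```python
def relu(fmap):
    """Apply max(0, v) to every value in a 2D feature map."""
    return [[max(0, v) for v in row] for row in fmap]

activated = relu([[-6, 12], [-6, 8]])
print(activated)
```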

Fully Connected Layer
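
A rough sketch of how a fully connected layer can turn a feature map into class probabilities: flatten the map, take one weighted sum per class, then normalise with softmax. The two-class weights here are hand-picked assumptions; a trained network would learn them.

```python
import math

def flatten(fmap):
    """Turn a 2D feature map into a flat list of inputs."""
    return [v for row in fmap for v in row]

def dense(inputs, weights, biases):
    """One weighted sum plus bias per output neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

features = flatten([[1.0, 2.0], [3.0, 4.0]])
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]  # hypothetical weights
biases = [0.0, 0.0]
scores = dense(features, weights, biases)
probs = softmax(scores)
predicted = probs.index(max(probs))
print(scores, probs, predicted)
```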

Image Classification

Forward propagation

Math behind perceptrons
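
The math behind a single perceptron-style neuron can be sketched as a weighted sum followed by an activation. The sigmoid activation and the sample weights are assumptions for the example; the slide's image may use a different activation.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Forward pass of one neuron: activation(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0.0, and sigmoid(0) = 0.5
output = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
print(output)
```

Forward propagation through a whole network repeats this step layer by layer, feeding each layer's outputs into the next.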
Back propagation

Back Propagation of Artificial Neural Network
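
A minimal sketch of back propagation on the smallest possible network: one sigmoid neuron trained with gradient descent on a squared-error loss. The training sample, learning rate, and step count are all made-up values; the point is only to show the chain rule turning an error into weight updates.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(weights, bias, inputs, target, lr):
    """One forward pass, one backward pass, one gradient descent update."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    y = sigmoid(z)
    loss = 0.5 * (y - target) ** 2
    # Chain rule: dL/dz = dL/dy * dy/dz = (y - target) * y * (1 - y)
    dz = (y - target) * y * (1.0 - y)
    # dL/dw_i = dz * x_i and dL/db = dz
    new_weights = [w - lr * dz * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * dz
    return new_weights, new_bias, loss

weights, bias = [0.0, 0.0], 0.0
losses = []
for _ in range(200):
    weights, bias, loss = train_step(weights, bias, [1.0, 2.0], 1.0, lr=0.5)
    losses.append(loss)

print(losses[0], losses[-1])
```

The loss shrinks as the updates accumulate; a deep network applies the same chain rule backwards through every layer.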
Semantic Segmentation

Semantic Segmentation with CASENet, Demo from YouTube
Semantic Segmentation and Mask R-CNN

Semantic Segmentation with Mask R-CNN, Demo from YouTube
Object Detection

YOLO v1, YOLO v2 & YOLO v3 in action
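
Object detectors output bounding boxes, and a common way to compare two boxes is intersection over union (IoU). The sketch below assumes boxes given as (x1, y1, x2, y2) corners; it is a general building block used when evaluating or filtering detections, not the YOLO models themselves.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
disjoint = iou((0, 0, 1, 1), (5, 5, 6, 6))
print(overlap, disjoint)
```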
So what about Terminator?
John Connor is a person of interest
WE KNOW WHAT HE LOOKS LIKE...

HOW CAN WE TRACK HIM?

HAAR CASCADES IN ACTION
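
Haar cascades are built from very cheap "Haar-like" features: the difference between the pixel sums of adjacent rectangles, computed quickly through an integral image. The sketch below shows just that building block on a made-up 4x4 patch. A real face detector (for example OpenCV's `CascadeClassifier`) combines many such features in a trained cascade, which this example does not attempt.

```python
def integral_image(gray):
    """ii[y][x] holds the sum of all pixels above and to the left of (x, y)."""
    h, w = len(gray), len(gray[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of any rectangle using only four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Bottom half minus top half: large when a dark band sits above a bright one."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return bottom - top

# Hypothetical patch: dark rows (like an eye region) above bright rows (cheeks).
gray = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
ii = integral_image(gray)
feature = two_rect_feature(ii, 0, 0, 4, 4)
print(feature)
```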
There are many other applications erm... Terminators
OCR


Complex Document Manipulation


Poverty Indexes

Protein Folding Prediction

Self-Driving Cars

THE CHALLENGE
CHALLENGES
- DATA COLLECTION
- DATA CURATION
- DATA QUALITY
- NO ONE SIZE FITS ALL
- SPEED vs ACCURACY
HASTA LA VISTA BABY!

Making computers "see"
By Raúl G. Roa Gómez