Visual Question Answering (VQA)

Laksh Arora

@techedlaksh

Anthill Inside 2017

About Me

Recent graduate from IPU.

Pythonista at heart, with interests in applications of Machine Learning and Computer Vision.

Organiser of PyDataDelhi meetups.

Previous talks at the PyDataDelhi community, CSI, and various other small meetups.

Independent study on Computer Vision, Deep Learning and RL

Agenda

Introduction to VQA
Motivation
Dataset
Methodology
Results
Code
Future Work

What is VQA?

Visual

Question

Answering

Predict the answer to a given question about an image.

Dataset

More than 0.25 million images
More than 0.75 million questions
About 10 million answers

COCO Dataset

VQA: Common Approach

Example question: Is there a bridge?

Image -> CNN -> Visual Representation
Question -> Word/Sentence Embedding + LSTM -> Textual Representation
Visual Representation + Textual Representation -> Merge -> Predict Answer -> Answer
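
The common approach can be sketched in a few lines of plain Python. This is only a toy illustration under loud assumptions: the image encoder is a seeded random vector standing in for CNN features, the question encoder averages random word embeddings instead of running an LSTM, the merge step is an element-wise product (one common choice among several), and the weights are untrained. The names `encode_image`, `encode_question`, `merge`, `predict_answer`, and the tiny vocabularies are hypothetical and are not taken from the talk's code.

```python
import math
import random

random.seed(0)

DIM = 8                                   # shared feature size (assumption)
ANSWERS = ["yes", "no", "2", "black"]     # toy answer vocabulary (assumption)
VOCAB = ["is", "there", "a", "bridge", "what", "color"]


def rand_vec(n):
    """Random vector used for untrained embeddings and weights."""
    return [random.uniform(-1, 1) for _ in range(n)]


# Untrained word embeddings and classifier weights (assumptions).
EMBED = {word: rand_vec(DIM) for word in VOCAB}
W = [rand_vec(DIM) for _ in ANSWERS]


def encode_image(image_id):
    """Stand-in for CNN features: a vector seeded by the image id."""
    rng = random.Random(image_id)
    return [rng.uniform(-1, 1) for _ in range(DIM)]


def encode_question(question):
    """Stand-in for embedding + LSTM: the mean of known word embeddings."""
    tokens = question.lower().replace("?", "").split()
    vecs = [EMBED[t] for t in tokens if t in EMBED]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def merge(visual, textual):
    """Merge step: element-wise product of the two representations."""
    return [v * t for v, t in zip(visual, textual)]


def predict_answer(image_id, question):
    """Score every candidate answer and return the most likely one."""
    fused = merge(encode_image(image_id), encode_question(question))
    logits = [sum(w * x for w, x in zip(row, fused)) for row in W]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]   # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(ANSWERS)), key=probs.__getitem__)
    return ANSWERS[best], probs


answer, probs = predict_answer(42, "Is there a bridge?")
print(answer, [round(p, 3) for p in probs])
```

Because nothing here is trained, the printed answer is arbitrary; the point is only the shape of the pipeline, where answering is treated as classification over a fixed answer set.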

Results

Code on GitHub

VQA models will answer any question

Visual Dialog

A man and woman are holding umbrellas.

What color is his umbrella?

Black

What about hers?

Multi-Colored

How many people are there in the image?
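
A follow-up such as "What about hers?" only makes sense with the earlier rounds in view. The sketch below shows one simple way to give a text encoder that context, assuming the caption, the past question-answer pairs, and the new question are just concatenated into one string. The helper `build_query` is hypothetical, and real Visual Dialog models may handle history quite differently.

```python
def build_query(caption, history, question):
    """Flatten the caption, past (question, answer) rounds, and the new question."""
    parts = [caption]
    for past_question, past_answer in history:
        parts.append(f"{past_question} {past_answer}")
    parts.append(question)
    return " ".join(parts)


caption = "A man and woman are holding umbrellas."
history = [("What color is his umbrella?", "Black")]
query = build_query(caption, history, "What about hers?")
print(query)
```

The resulting string could then be passed to the same kind of question encoder used for single-turn VQA.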

Thanks!

@techedlaksh
contactme@laksharora.com