Optical Character Recognition

About Us

  • Archana Iyer & Soham Chatterjee
  • Undergraduate students at SRM University
  • Deep Learning Intern at Saama
  • B. Tech. Electrical and Electronics
  • Next Tech Lab
  • Saama Blog
  • Soham's Blog

What is OCR? 

OCR is the conversion of typed or handwritten text into machine-encoded text. It is one of the hardest problems to solve in computer vision and is still an active area of research with no one standard model.

 

 

Why OCR?

OCR has wide-ranging implications in many industries.

Anywhere there is a need to convert handwritten text into machine-encoded text, OCR can be used to reduce errors and increase speed.

Most Interesting Recent Use Cases

  • Let us first see how ​Coca-Cola is applying OCR in their everyday products
  • OCR is used widely by shipping agencies and post offices to read the addresses written by senders.

 

Where should I begin?

  • After seeing such interesting use cases the first thing that pops into our heads is where shall we also start learning how to do the same-If not, we hope our use case might help you get started
  • OpenCV- It is a image manipulation library
  • MNIST/EMNIST
  • IAM Dataset-Industry standard for OCR

OCR For Patient Form Text

  • There is no good way predict handwritten text yet. 
  • CNN-RNN architectures have far lower levels of accuracy
  • In the medical/Pharma-written text is far greater prevalent
  • Doctor written prescriptions are the next level to explore
  • We have tried to explore fixed form to understand handwritten text 

Process for Solving Our Problem

 Choosing a Deep Learning Model

  • CNN Based Computer Vision Works Best

CNN Recap

Approach 1: Simplify the Problem

  • Detect Individual Letters
  • Use a windowing technique to identify all the letters

Process

  • Get Data - MNIST
  • Clean and Pre-process - Done!
  • Model - Done!
  • Test - Demo
  • Improve?

2nd Approach: Brute Force

  • Train on all words of the English Language
  • Assume a max word length of 10
  • Total Labels - 10^26

Process

  • Model - CNN
  • Clean and Pre-process
  • Get Data

3rd Approach – Read Papers

Instead of classification problem, change it to a binary decision problem.

 

                                   “MOVE”

 

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

0 0 0 0  1 0 0 0 0 0 0 0 1 0  1 0 0 0 0  0 0  1 0 0 0 0

 

Learning Attributes Instead of Words – Unigram Approach

Consider the case of a training set of size 1,000. The word “SLEEP” may appear only twice, but attributes such as “does the word contain the unigram ‘S’ in the first half of the word could occur multiple times – which is great for CNN's.

 

Similar words may confuse the network - Consider the words “KIDS” and “BIDS”. A “KIDS” word image is a negative sample for the “BIDS” category, although a large part of their appearance is shared. This similarity between some categories makes a category based classifier harder to learn, whereas an attributes based classifier uses this to its advantage.

STEP 1: Getting and Cleaning Data

  • IAM Dataset
  • 657 writers contributed samples of their handwriting

    1'539 pages of scanned text

    5'685 isolated and labeled sentences

    13'353 isolated and labeled text lines

    115'320 isolated and labeled words

 Step 2 -Training and Preprocessing 

  • Normalization – Convert to grayscale – Divide each pixel by 255.
  • Label Encoding – One hot vector.
  • Image augmentation

 Step 3-Training

  • Monitor Validation Loss
  • Reduce Learning Rate
  • A word on accuracy

Step 4-Testing and Improving

  • What if the results are not that good?
  • Monitor loss function
  • Why is accuracy so high
  • Should we consider accuracy or some other score?
  • Does not generalize well even though the training and validation accuracy is good.
  • Overfitting
  • After training and testing accuracy is improved, how do we generalize to other areas?
  • Pre-processing of form data
  • What to do with the line
  • How to filter blue values
  • Pre-processing issue, still not working/generalizing well outside of IAM database

 

 

DEMO 

Contact us

  • Feedback- A link to give us a feedback-https://tinyurl.com/dnamethyltalk
  • We would love to hear from you and your responses will be Anonymous!
  • Contact us:
  • Archana Iyer:
    • varchanaiyer139@gmail.com
  • Soham Chatterjee:
    • 96soham96@gmail.com
    • csoham.wordpress.com 

OCR Meetup Talk

By archana iyer

OCR Meetup Talk

  • 779