ReceiptBudget
Managing expenses made easy
Why Keep a Budget?
So that you know where your money goes
And to help you make financial plans
I came to Cluj in 2011 for college.
The first month, I expected to have a lot of expenses.
By the end of the second month:
Where's all my money?
I had X at the beginning of the month and now I have X - X = 0
I had to do something
Version 1:
Not good enough
Version 2:
Now including some reports!
Pretty...
... boring
and complicated
Version 3:
Meet ReceiptBudget
A web app written in Python
Features
- Adding expenses by just taking a photo of the receipt
- all the details, none of the work
- Much more detailed reports
- visualizing expenses on a map
- slicing and dicing data based on month, day, shop
- And backwards compatible with what I had until now
The OCR engine
- had to be custom built
- receipts are space constrained, so the font is usually very squeezed
- Tesseract (best free alternative) was hit-and-miss - sometimes it worked, sometimes it didn't
- I developed a custom tool just to gather all the data for training
Steps
- the image is preprocessed
- straightened
- edges removed
- binarized
- characters are segmented with Random Forests and recognized with a linear SVM
- accuracy: ~85%
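As an illustration of the binarization step, here is a minimal sketch using Otsu's method in plain Python. The slides do not say which thresholding technique ReceiptBudget uses, so Otsu is an assumption, and the names `otsu_threshold` and `binarize` are hypothetical. The image is assumed to be a list of rows of 0-255 grayscale values.

```python
def otsu_threshold(pixels):
    """Return the threshold (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(256):
        weight_bg += hist[t]          # pixels with value <= t
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue                  # no dark pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break                     # every pixel is already in the dark class
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image):
    """Map a grayscale image to 1 (ink) / 0 (paper). Assumes dark text on light paper."""
    flat = [value for row in image for value in row]
    t = otsu_threshold(flat)
    return [[1 if value <= t else 0 for value in row] for row in image]


# Tiny made-up example: two dark "ink" pixels on light paper.
image = [[250, 245, 30],
         [240, 20, 235]]
print(binarize(image))
```

A global threshold like this only works well once the receipt has been straightened and its edges removed, since shadows and background clutter skew the histogram.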
Biggest bottleneck in the pipeline
- if one digit in a date is not recognized correctly
- "23/10/13" => "S3/10/13"
- if one letter in TOTAL is changed, how to tell it's not an item?
- TOTA - brand of professional photography lamps
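One possible way to soften the date problem is to validate what the OCR returns: replace every non-digit character in a date field with each possible digit and keep only the candidates that parse as real dates. This is a sketch, not what ReceiptBudget does; the function name `candidate_dates` is hypothetical and the `dd/mm/yy` format is assumed from the example above.

```python
from datetime import datetime
from itertools import product


def candidate_dates(text, fmt="%d/%m/%y"):
    """Return every valid date obtainable by replacing misread characters with digits."""
    # Positions that should hold a digit but hold something else.
    slots = [i for i, ch in enumerate(text) if not ch.isdigit() and ch != "/"]
    results = []
    for digits in product("0123456789", repeat=len(slots)):
        chars = list(text)
        for pos, digit in zip(slots, digits):
            chars[pos] = digit
        candidate = "".join(chars)
        try:
            datetime.strptime(candidate, fmt)   # rejects impossible dates like 33/10/13
        except ValueError:
            continue
        results.append(candidate)
    return results


print(candidate_dates("S3/10/13"))
```

For "S3/10/13" this leaves three valid candidates (the 3rd, 13th and 23rd), so validation narrows the options but cannot pick the right one by itself. That ambiguity is exactly why a single misread character is so costly.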
Built using:
All three are free, open-source libraries for scientific computing, machine learning and computer vision
The Dashboard
The goal is to get some insight into spending patterns
If I know when and where I usually spend more money, I can start doing something about it
(at least be more careful when walking past my favorite shop)
The map
Most of my expenses were at Kaufland, which is to be expected: it's a grocery store
But the second most expensive spot is the area around my university
I often go to the nearby store to buy snacks
If I were to buy them somewhere else, I could probably save some money
The charts
Observations
- I spend a lot of money on Mondays
- very little on Saturdays
- There is a peak at the beginning of the month
- probably taxes and rent
- I spend a lot of money in odd places
- the Unknown column, where I didn't write down the place of spending
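Observations like these come from grouping expenses by a single attribute and summing the amounts. A minimal sketch of the weekday breakdown follows; the record layout `(date, shop, amount)`, the function name `totals_by_weekday` and the sample figures are all made up for illustration and are not ReceiptBudget's actual data model.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def totals_by_weekday(expenses):
    """Sum amounts per weekday for (date, shop, amount) records."""
    totals = defaultdict(float)
    for day, _shop, amount in expenses:
        totals[WEEKDAYS[day.weekday()]] += amount   # weekday(): Monday is 0
    return dict(totals)


# Hypothetical sample data.
expenses = [
    (date(2013, 10, 21), "Kaufland", 120.0),
    (date(2013, 10, 26), "Unknown", 5.5),
    (date(2013, 10, 28), "Kaufland", 80.0),
]
print(totals_by_weekday(expenses))
```

Swapping the grouping key for the shop name or the day of the month would give the other two breakdowns in the same way.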
Built using
To do
- deep learning for the OCR engine
- Google's doing it, so it must be good
- use a probabilistic classifier instead of a rule-based one for understanding line contents
- dashboard should make predictions, not just give reports
Results after experimenting with restricted Boltzmann machines
Questions?
ReceiptBudget
By rolisz