Lazy Pandas
PyOhio 2013
Lightning Talk
Ron DuPlain
ron.duplain@gmail.com
https://slid.es/rduplain/lazy-pandas
photo credit: mine
Motivation
Preprocess some dataset into one or more tables.
Load the data into pandas.
... without aggressively consuming memory.
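One way to keep memory in check while loading is to read the file in chunks and keep only the rows of interest from each chunk. The sketch below is an illustration, not the approach taken in the talk's repository; the column names, the sample data, and the `load_filtered` helper are all made up for the example.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a large CSV file on disk.
CSV_TEXT = """id,group,value
1,a,0.5
2,b,1.5
3,a,2.5
4,b,3.5
5,a,4.5
"""


def load_filtered(source, keep_group, chunksize=2):
    """Read a CSV in chunks and keep only rows whose group matches."""
    kept = []
    # With chunksize set, read_csv returns an iterator of DataFrames
    # instead of parsing the whole file at once.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        kept.append(chunk[chunk["group"] == keep_group])
    # Only the filtered rows are held until the final concat.
    return pd.concat(kept, ignore_index=True)


frame = load_filtered(io.StringIO(CSV_TEXT), "a")
print(frame)
```

Peak memory then depends on the chunk size plus the rows that survive the filter, rather than on the full file.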
Too large to fit into memory?
Use a query to pre-filter data.
Watch for memory consumption in the middleware!
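A minimal sketch of the pre-filtering idea, assuming the data lives in a SQL database: the `WHERE` clause does the filtering in the database, and the middleware pulls rows in small batches instead of fetching the whole result. It uses the standard-library `sqlite3` module with an in-memory table; the table, its columns, and the `iter_batches` helper are hypothetical.

```python
import sqlite3


def iter_batches(conn, query, params=(), batch_size=1000):
    """Yield (columns, rows) batches so the full result is never held at once."""
    cursor = conn.execute(query, params)
    columns = [description[0] for description in cursor.description]
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield columns, rows


# Hypothetical table standing in for a preprocessed dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (id INTEGER, grp TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [(i, "a" if i % 2 else "b", i * 0.5) for i in range(1, 11)],
)

# The query pre-filters; only matching rows ever leave the database.
query = "SELECT id, value FROM measurements WHERE grp = ? ORDER BY id"

# list() is used here only to inspect the batches in this small demo;
# real code would consume the generator one batch at a time.
batches = list(iter_batches(conn, query, ("a",), batch_size=2))
for columns, rows in batches:
    print(columns, rows)
```

Each batch could be passed to `pandas.DataFrame.from_records(rows, columns=columns)` and then processed or appended, so the middleware holds at most `batch_size` rows of plain Python tuples at a time.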
Obstacle
I have tried ...
Solr's CSV writer with pysolr,
direct urlopen, or requests.
Direct SQLAlchemy query iteration.
... and ended up consuming 2GB of memory
from a 40MB CSV file on disk.
(SQLAlchemy is still awesome!)
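The slides do not diagnose where the memory went. One plausible contributor (an assumption, not a finding from the talk) is per-object overhead when every row is materialized as Python objects before reaching pandas. The sketch below uses the standard-library `tracemalloc` module to compare peak allocation when holding all parsed rows versus streaming over them; the generated data and the helper names are hypothetical.

```python
import csv
import io
import tracemalloc


def make_csv_text(n_rows=20000):
    """Build hypothetical CSV text in memory to stand in for a file on disk."""
    lines = ["id,name,value"]
    lines.extend("%d,item%d,%0.3f" % (i, i, i * 0.001) for i in range(n_rows))
    return "\n".join(lines)


def peak_bytes(func, *args):
    """Return the peak number of bytes allocated while func(*args) runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def load_all(text):
    # Materializes every row as a list of str objects at the same time.
    return list(csv.reader(io.StringIO(text)))


def stream(text):
    # Touches one row at a time; earlier rows can be freed immediately.
    count = 0
    for _row in csv.reader(io.StringIO(text)):
        count += 1
    return count


text = make_csv_text()
peak_all = peak_bytes(load_all, text)
peak_stream = peak_bytes(stream, text)
print("CSV text length:", len(text))
print("peak bytes, all rows held:", peak_all)
print("peak bytes, streaming:", peak_stream)
```

The exact numbers vary by Python version and platform, so none are quoted here; the comparison between the two peaks is the point.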
Code
https://github.com/rduplain/expression_file
An experiment in lazily loading data into a Python pandas v0.12.0 DataFrame.