POLARS
Revolutionize Your Data Workflow with Polars: The Ultimate Pandas Replacement
Index
Introduction
- No query planning
- Limited multicore algorithms
- RAM management
- Warty missing data support
- Complex group by operations awkward and slow
- Columnar
- Standardization
- Cross Languages
- Diverse Data Type
- Nested Data Type
Apache Arrow
Column vs ROW
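The difference between the two layouts can be illustrated with a minimal sketch in plain Python. The field names (`symbol`, `price`, `volume`) and values are made up for illustration. The dict-of-lists only mimics a columnar layout; it is not how Arrow stores data in memory.

```python
# Row-oriented layout: one record per element, fields interleaved.
rows = [
    {"symbol": "A", "price": 10.0, "volume": 100},
    {"symbol": "B", "price": 20.0, "volume": 200},
    {"symbol": "C", "price": 30.0, "volume": 300},
]

# Column-oriented layout: one sequence per field, values of the same type together.
columns = {
    "symbol": ["A", "B", "C"],
    "price": [10.0, 20.0, 30.0],
    "volume": [100, 200, 300],
}


def mean_price_rows(records):
    # Every record must be visited and the field looked up each time.
    return sum(record["price"] for record in records) / len(records)


def mean_price_columns(table):
    # Only the one column that is needed is touched.
    prices = table["price"]
    return sum(prices) / len(prices)


print(mean_price_rows(rows), mean_price_columns(columns))
```

Both functions return the same value. The columnar version reads a single homogeneous sequence, which is the access pattern analytical queries tend to favor.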
Core Features and Advantages
Rust
Polars is written in Rust, which gives it C/C++-level performance and lets it fully control the performance-critical parts of a query engine.
ARROW
Apache Arrow is a cross-language, columnar memory format for big data analytics, enabling efficient data sharing between systems.
Expression
Expressions describe custom computations and transformations as composable operations, which the query engine can optimize and run in parallel for better performance.
IO and Lazy mode
QUERY PLAN
- select
- filter
- join
- with_columns
- sort
- groupby | agg
Manipulation/selection
- collect
- fetch
- serialize
- deserialize
- profile
- explain
- over
- str
- dt
- cat
- bin
- list
JOIN
GROUP BY
Expressions
MERGE
LazyFRAME
Real-world Applications
Tick2kline pandas
Tick2kline polars
Tick2kline
Daily data preprocess
Futures Data Clean
tick2kline with rust
Polars Extension with rust
QA
Polars Rust Python
Alternatives compare
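| | PANDAS | DASK | MODIN | VAEX | CUDF | SPARK | DUCKDB | POLARS |
|---|---|---|---|---|---|---|---|---|
| QUERY PLAN | | | | | | | | |
| LARGE DATA | | | | | | | | |
| DIRTY DATA | | | | | | | | |
| STRICTER API | | | | | | | | |
| COLUMNAR | | | | | | | | |
| PARALLELIZE | | | | | | | | |
| DISTRIBUTED | | | | | | | | |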
Polars
By yvictor