select *
from read_parquet()
limit 100;
select count(*)
from read_parquet()
limit 100;
-- Inferred schema based on select
describe select *
from read_parquet();
-- Actual parquet file schema
select *
from parquet_schema();
select *
from parquet_metadata();
1. In-memory
2. Persistent
jdbc:duckdb:
jdbc:duckdb:memory:
jdbc:duckdb:/path/to/file.duckdb
create view my_view as
select *
from read_json_auto('...');
select *
from my_view; -- actually reads a file
create table my_table -- copies data to memory
select *
from read_json_auto('...');
select *
from my_table; -- reads data from memory
create view my_view as
select *
from read_json_auto('...');
select *
from my_view; -- actually reads a file
create table my_table -- copies data to DB file
select *
from read_json_auto('...');
select *
from my_table; -- reads data from DB file
Inserts work as expected:
For in-memory mode insert
copies data to in-memory table.
For persistent mode insert
copies data to table in DB file.
+----------------+------------------------------------------------------------------------+
|extension_name |description |
+----------------+------------------------------------------------------------------------+
|arrow |A zero-copy data integration between Apache Arrow and DuckDB |
|autocomplete |Adds support for autocomplete in the shell |
|aws |Provides features that depend on the AWS SDK |
|azure |Adds a filesystem abstraction for Azure blob storage to DuckDB |
|delta |Adds support for Delta Lake |
|excel |Adds support for Excel-like format strings |
|fts |Adds support for Full-Text Search Indexes |
|httpfs |Adds support for reading and writing files over a HTTP(S) connection |
|iceberg |Adds support for Apache Iceberg |
|icu |Adds support for time zones and collations using the ICU library |
|inet |Adds support for IP-related data types and functions |
|jemalloc |Overwrites system allocator with JEMalloc |
|json |Adds support for JSON operations |
|motherduck |Enables motherduck integration with the system |
|mysql_scanner |Adds support for connecting to a MySQL database |
|parquet |Adds support for reading and writing parquet files |
|postgres_scanner|Adds support for connecting to a Postgres database |
|spatial |Geospatial extension that adds support for working with spatial data... |
|sqlite_scanner |Adds support for reading and writing SQLite database files |
|substrait |Adds support for the Substrait integration |
|tpcds |Adds TPC-DS data generation and query support |
|tpch |Adds TPC-H data generation and query support |
|vss |Adds indexing support to accelerate Vector Similarity Search |
+----------------+------------------------------------------------------------------------+
Source
Target
DuckDB
Local machine
EC2 instance
Lambda
1. Single-node processing
2. Sharing
Kinda solvable with duckdb file + sql script