Duckhouse: DuckDB + Iceberg + Flight
DuckDB has become well known as a lightweight, portable, and fast OLAP database.
While it excels as an embedded engine, could we push its boundaries further?
Could we build an actual data platform centered around DuckDB?
This is the idea behind Duckhouse.

Check the full article here
Getting Started
Installing Dependencies
Running the Flight Server
uv run iceberg_over_flight.py serve -w warehouse -p 8816
Ingest data
curl https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2023-01.parquet -o /tmp/yellow_tripdata_2023-01.parquet
uv run ingestion/ingestion.py
Run dbt
cd dbt_xorq_project
export PYTHONPATH="$PWD:$PYTHONPATH"
dbt run
Supported Operations
- Reading and writing Iceberg tables with Flight Server
- dbt run using the Flight plugin
- Filtering and column selection
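
As a rough illustration of what a filtered, column-pruned read against the Flight server could look like from a Python client, here is a minimal sketch. The JSON ticket layout, the build_ticket and fetch helpers, and the table name used in the usage note are all assumptions made for this example; the request format that iceberg_over_flight.py actually expects may differ, so check the server code before relying on it. Only the pyarrow.flight calls (connect, Ticket, do_get, read_all) are standard Arrow Flight client API.

```python
import json


def build_ticket(table, columns=None, predicate=None):
    """Encode a read request as JSON bytes.

    Hypothetical layout: the real server may expect a different ticket format.
    """
    request = {"table": table}
    if columns:
        # Column selection: only these columns should be returned.
        request["columns"] = list(columns)
    if predicate:
        # Filtering: a SQL-style predicate for the server to apply.
        request["filter"] = predicate
    return json.dumps(request, sort_keys=True).encode("utf-8")


def fetch(ticket_bytes, location="grpc://localhost:8816"):
    """Send the ticket to a running Flight server and return a pyarrow.Table."""
    # Imported lazily so build_ticket works without pyarrow installed.
    import pyarrow.flight as flight

    client = flight.connect(location)
    reader = client.do_get(flight.Ticket(ticket_bytes))
    return reader.read_all()
```

With the server from the earlier step listening on port 8816, something like fetch(build_ticket("yellow_tripdata", columns=["passenger_count", "trip_distance"], predicate="trip_distance > 10")) would request two columns of the long trips only. The table name here is a placeholder; use whatever name the ingestion script registers.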