Icebird: JavaScript Iceberg Reader
Icebird is a library for reading Apache Iceberg tables in JavaScript. It is built on top of hyparquet for reading the underlying parquet files.
Usage
To read an Iceberg table:
const { icebergRead } = await import('icebird') const tableUrl = 'https://s3.amazonaws.com/hyperparam-iceberg/spark/bunnies' const data = await icebergRead({ tableUrl, rowStart: 0, rowEnd: 10, })
To read the Iceberg metadata (schema, etc):
import { icebergMetadata } from 'icebird' const metadata = await icebergMetadata({ tableUrl }) // subsequent reads will be faster if you provide the metadata: const data = await icebergRead({ tableUrl, metadata, })
Demo
Check out a minimal iceberg table viewer demo that shows how to integrate Icebird into a react web application using HighTable to render the table data. You can view any publicly accessible Iceberg table:
- Live Demo: https://hyparam.github.io/demos/icebird/
- Demo Source Code: https://github.com/hyparam/demos/tree/master/icebird
Time Travel
To fetch a previous version of the table, you can specify metadataFileName:
import { icebergRead } from 'icebird' const data = await icebergRead({ tableUrl, metadataFileName: 'v1.metadata.json', })
Authentication
You can add authentication to all http requests by passing a requestInit argument that will be passed to fetch:
import { icebergRead } from 'icebird' const data = await icebergRead({ tableUrl, requestInit: { headers: { Authorization: 'Bearer my_token', }, } })
Supported Features
Icebird aims to support reading any Iceberg table, but currently only supports a subset of the features. The following features are supported:
| Feature | Supported |
|---|---|
| Read Iceberg v1 Tables | ✅ |
| Read Iceberg v2 Tables | ✅ |
| Read Iceberg v3 Tables | ❌ |
| Parquet Storage | ✅ |
| Avro Storage | ✅ |
| ORC Storage | ❌ |
| Puffin Storage | ❌ |
| File-based Catalog (version-hint.text) | ✅ |
| REST Catalog | ❌ |
| Hive Catalog | ❌ |
| Glue Catalog | ❌ |
| Service-based Catalog | ❌ |
| Position Deletes | ✅ |
| Equality Deletes | ✅ |
| Binary Deletion Vectors | ❌ |
| Rename Columns | ✅ |
| Efficient Partitioned Read Queries | ❌ |
| All Parquet Compression Codecs | ✅ |
| All Parquet Types | ✅ |
| Variant Types | ❌ |
| Geometry Types | ❌ |
| Geography Types | ❌ |
| Sorting | ❌ |
| Encryption | ❌ |
