GitHub - michalc/pgarrow: A SQLAlchemy PostgreSQL dialect for ADBC: postgresql+pgarrow

A SQLAlchemy PostgreSQL dialect for ADBC (Arrow Database Connectivity)

Installation

pgarrow can be installed from PyPI using pip:

Usage

pgarrow can be used using the postgresql+pgarrow dialect when creating a SQLAlchemy engine. For example, to create an engine for a PostgreSQL database at 127.0.0.1 (localhost) on port 5432 with user postgres and password password:

engine = sa.create_engine('postgresql+pgarrow://postgres:password@127.0.0.1:5432/')

Query returning built-in Python types

To run a query that returns built-in Python types, as is typical with SQLAlchemy:

import sqlalchemy as sa

engine = sa.create_engine('postgresql+pgarrow://postgres:password@127.0.0.1:5432/')

with engine.connect() as conn:
    results = conn.execute(sa.text("SELECT 1")).fetchall()

Query returning an Arrow table

To run a query that returns an Arrow table, which should be the most performant for large datasets, you must use SQLAlchemy's driver_connection to access the ADBC-level connection, create a cursor from it to run the query and fetch the table using fetch_arrow_table:

import sqlalchemy as sa

engine = sa.create_engine('postgresql+pgarrow://postgres:password@127.0.0.1:5432/')

with \
        engine.connect() as conn, \
        conn.connection.driver_connection.cursor() as cursor:

    cursor.execute("SELECT 1 AS a, 2.0::double precision AS b, 'Hello, world!' AS c")
    table = cursor.fetch_arrow_table()

Replace PostgreSQL table with an Arrow table

To insert data into the database from an Arrow table, a similar pattern must be used to use adbc_ingest:

import sqlalchemy as sa

engine = sa.create_engine('postgresql+pgarrow://postgres:password@127.0.0.1:5432/')
table = pa.Table.from_arrays([[1,], [2,], ['Hello, world!',]], schema=pa.schema([
    ('a', pa.int32()),
    ('b', pa.float64()),
    ('c', pa.string()),
]))

with \
        engine.connect() as conn, \
        conn.connection.driver_connection.cursor() as cursor:

    cursor.adbc_ingest("my_table", table, mode="create")
    conn.commit()

Create a table with SQLAlchemy and append an Arrow table

To create a table using SQLAlchemy, and then append an Arrow table to it:

import sqlalchemy as sa

metadata = sa.MetaData()
sa.Table(
    "my_table",
    metadata,
    sa.Column("a", sa.INTEGER),
    sa.Column("b", sa.DOUBLE_PRECISION),
    sa.Column("c", sa.TEXT),
    schema="public",
)
table = pa.Table.from_arrays([[1,], [2,], ['Hello, world!',]], schema=pa.schema([
    ('a', pa.int32()),
    ('b', pa.float64()),
    ('c', pa.string()),
]))

with \
        engine.connect() as conn, \
        conn.connection.driver_connection.cursor() as cursor:

    metadata.create_all(conn)
    cursor.adbc_ingest("my_table", table, mode="append")
    conn.commit()

Compatibility

Python >= 3.10 (tested on 3.10.0, 3.11.1, 3.12.0, and 3.13.0)
PostgreSQL >= 13.0 (tested on 13.0, 14.0, 15.0, 16.0, 17.0, and 18.0)
SQLAlchemy >= 2.0.7 on Python < 3.13.0; and >= 2.0.41 on Python >=3.13.0 (tested on 2.0.7 with Python before 3.13.0; and tested on 2.0.41 with Python 3.13.0)
PyArrow >= 15.0.0 with Python < 3.13, and PyArrow >= 18.0.0 with Python >= 3.13.0 (tested on 15.0.0, 16.0.0, 17.0.0, 18.0.0, 19.0.0, 20.0.0, 21.0.0, and 22.0.0 with Python before 3.13.0; and 18.0.0, 19.0.0, 20.0.0, 21.0.0, and 22.0.0 with Python 3.13.0)
adbc-driver-postgresql >= 1.9.0 (tested on 1.9.0)