A SQLAlchemy PostgreSQL dialect for ADBC (Arrow Database Connectivity)
Contents
Installation
pgarrow can be installed from PyPI using pip:
Usage
pgarrow can be used using the postgresql+pgarrow dialect when creating a SQLAlchemy engine. For example, to create an engine for a PostgreSQL database at 127.0.0.1 (localhost) on port 5432 with user postgres and password password:
engine = sa.create_engine('postgresql+pgarrow://postgres:password@127.0.0.1:5432/')
Query returning built-in Python types
To run a query that returns built-in Python types, as is typical with SQLAlchemy:
import sqlalchemy as sa engine = sa.create_engine('postgresql+pgarrow://postgres:password@127.0.0.1:5432/') with engine.connect() as conn: results = conn.execute(sa.text("SELECT 1")).fetchall()
Query returning an Arrow table
To run a query that returns an Arrow table, which should be the most performant for large datasets, you must use SQLAlchemy's driver_connection to access the ADBC-level connection, create a cursor from it to run the query and fetch the table using fetch_arrow_table:
import sqlalchemy as sa engine = sa.create_engine('postgresql+pgarrow://postgres:password@127.0.0.1:5432/') with \ engine.connect() as conn, \ conn.connection.driver_connection.cursor() as cursor: cursor.execute("SELECT 1 AS a, 2.0::double precision AS b, 'Hello, world!' AS c") table = cursor.fetch_arrow_table()
Replace PostgreSQL table with an Arrow table
To insert data into the database from an Arrow table, a similar pattern must be used to use adbc_ingest:
import sqlalchemy as sa engine = sa.create_engine('postgresql+pgarrow://postgres:password@127.0.0.1:5432/') table = pa.Table.from_arrays([[1,], [2,], ['Hello, world!',]], schema=pa.schema([ ('a', pa.int32()), ('b', pa.float64()), ('c', pa.string()), ])) with \ engine.connect() as conn, \ conn.connection.driver_connection.cursor() as cursor: cursor.adbc_ingest("my_table", table, mode="create") conn.commit()
Create a table with SQLAlchemy and append an Arrow table
To create a table using SQLAlchemy, and then append an Arrow table to it:
import sqlalchemy as sa metadata = sa.MetaData() sa.Table( "my_table", metadata, sa.Column("a", sa.INTEGER), sa.Column("b", sa.DOUBLE_PRECISION), sa.Column("c", sa.TEXT), schema="public", ) table = pa.Table.from_arrays([[1,], [2,], ['Hello, world!',]], schema=pa.schema([ ('a', pa.int32()), ('b', pa.float64()), ('c', pa.string()), ])) with \ engine.connect() as conn, \ conn.connection.driver_connection.cursor() as cursor: metadata.create_all(conn) cursor.adbc_ingest("my_table", table, mode="append") conn.commit()
Compatibility
- Python >= 3.10 (tested on 3.10.0, 3.11.1, 3.12.0, and 3.13.0)
- PostgreSQL >= 13.0 (tested on 13.0, 14.0, 15.0, 16.0, 17.0, and 18.0)
- SQLAlchemy >= 2.0.7 on Python < 3.13.0; and >= 2.0.41 on Python >=3.13.0 (tested on 2.0.7 with Python before 3.13.0; and tested on 2.0.41 with Python 3.13.0)
- PyArrow >= 15.0.0 with Python < 3.13, and PyArrow >= 18.0.0 with Python >= 3.13.0 (tested on 15.0.0, 16.0.0, 17.0.0, 18.0.0, 19.0.0, 20.0.0, 21.0.0, and 22.0.0 with Python before 3.13.0; and 18.0.0, 19.0.0, 20.0.0, 21.0.0, and 22.0.0 with Python 3.13.0)
- adbc-driver-postgresql >= 1.9.0 (tested on 1.9.0)