A pure-Python Database Abstraction Layer. pyDAL generates SQL (or the appropriate query objects for NoSQL backends) in real time using the dialect of the configured back end, so you write Python instead of SQL and the same application runs unchanged against many databases.
Why pyDAL
pyDAL is intentionally not an Object-Relational Mapper. Most ORMs — Django ORM, SQLAlchemy — map tables to Python classes and rows to instances of those classes. pyDAL instead treats rows as plain Python dictionaries (with attribute access for convenience) and keeps the API close to SQL. The result:
- A small, predictable surface. If you know SQL you'll be productive in minutes.
- No declarative class hierarchies, no metaclass magic, no two-phase schema bootstrap.
- The same query DSL runs against ~15 backends. Swap the connection string and the app keeps working.
- Rows are just dicts — easy to serialize, easy to pass around, easy to inspect.
Installation
The only hard dependency is Python ≥ 3.7. The SQLite driver ships with
Python, so the first example below runs out of the box. For other
backends, install the appropriate Python driver (psycopg2, pymysql,
pymongo, …) — pyDAL picks it up automatically.
A first example
from pydal import DAL, Field db = DAL("sqlite://storage.db") db.define_table("thing", Field("name")) db.thing.insert(name="Chair") db.thing.insert(name="Table") for row in db(db.thing.name.startswith("C")).select(): print(row.id, row.name) # 1 Chair db.commit()
Every line of the example above maps directly to a SQL operation:
define_table to CREATE TABLE, insert to INSERT, the call to db(…)
builds a WHERE clause, and .select() runs the query and returns rows.
What's in the box
- Schema definition with explicit field types, validators, defaults, foreign keys, and indexes.
- Migrations: when a table definition changes between runs, the
appropriate
ALTER TABLEis generated and applied. - Transactions with explicit commit/rollback.
- A query DSL covering comparisons, logical operators, string
match, regex, aggregates, date/time accessors,
IN,CASE,COALESCE, and substring expressions. - Inner joins, left outer joins, cross joins, with
.on()syntax or implicit cross-table queries. - Subqueries: in
INclauses, as join sources, or as inline expressions. - Common Table Expressions (CTEs), including recursive CTEs.
- Type-safe parameter binding with placeholders (
?,$1,%s, …) selected per dialect. - Lazy iteration for large result sets (
iterselect). - Built-in CSV import/export per-table or for the whole database.
- A natural-language QueryBuilder that turns
'name starts with C and age >= 18'into a real query.
Supported databases
A non-exhaustive list:
| Database | Driver |
|---|---|
| SQLite | sqlite3 (built-in) |
| PostgreSQL | psycopg2, pg8000 |
| MySQL | pymysql, MySQLdb |
| MSSQL | pyodbc |
| Oracle | cx_Oracle |
| FireBird | kinterbasdb, fdb, pyodbc |
| DB2 | pyodbc |
| Informix | informixdb |
| Ingres | ingresdbi |
| Sybase / SAP | Sybase, sapdb |
| Teradata | pyodbc |
| Snowflake | snowflake-connector-python |
| MongoDB | pymongo |
| Google Firestore | google-cloud-firestore |
| IMAP | imaplib (built-in) |
The DAL API: a guided tour
The DSL is built from seven core objects. Once you've seen them once, the rest of the API is just method calls.
DAL — the connection
db = DAL("sqlite://storage.sqlite")
The constructor accepts a connection string (also called the
uri). Examples for the most common backends:
| Database | Connection string |
|---|---|
| SQLite | sqlite://storage.sqlite or sqlite:memory |
| PostgreSQL | postgres://user:pass@localhost/test |
| MySQL | mysql://user:pass@localhost/test?set_encoding=utf8mb4 |
| MSSQL ≥ 2012 | mssql4://user:pass@localhost/test |
| Oracle | oracle://user/pass@test |
| MongoDB | mongodb://user:pass@localhost/test |
You can also pass None to build a "dry" DAL that generates SQL
without connecting, or do_connect=False to defer connection until
needed.
A few commonly-used DAL parameters:
pool_size— number of pooled connections (default0). Ignored by SQLite.folder— where migration metadata is written. Set this explicitly when using pyDAL standalone with SQLite.migrate— global default for whether table changes generateALTER TABLEstatements (defaultTrue).check_reserved— list of backend names to validate identifiers against (e.g.["postgres", "mssql"]).
Table — a database table
You don't instantiate Table directly; you define it via the DAL:
db.define_table("person", Field("name"), Field("age", "integer"))
This returns a Table object also accessible as db.person. Every
table automatically gets an auto-increment integer primary key called
id unless you explicitly opt out via the primarykey argument.
Some useful Table arguments:
format— a record representation, used for foreign-key display:format="%(name)s"orformat=lambda r: r.name.rname— the real SQL name when the table is known by a different identifier in the database (e.g. a legacy name, a schema-qualified name like"app1.dbo.legacy_table").redefine=True— allow redefining an existing table (triggers a migration if the schema differs).
Field — a column
Field("name", "string", length=80, default="anonymous", required=True)
The default type is "string". Available types:
| Type | Notes |
|---|---|
string |
default length 512 |
text |
default length 32768 |
blob |
binary; default length 2 GiB |
boolean |
|
integer |
32-bit |
bigint |
64-bit |
double |
|
decimal(n, m) |
fixed precision |
date |
|
time |
|
datetime |
|
password |
string with optional hashing validator |
upload |
stores a filename; file is saved on disk |
json |
any JSON-serializable value |
reference <table> |
foreign key to <table> |
list:string |
a list of strings, stored encoded |
list:integer |
a list of integers |
list:reference <t> |
a list of foreign keys |
Field options you'll reach for often: default, notnull, unique,
required, requires=<validator>, compute=<func>, update=<value>,
label, readable, writable, rname.
Query — a WHERE clause
A Query is the result of comparing or combining fields and values:
q = (db.person.age >= 18) & (db.person.name != "anonymous")
Supported operators: ==, !=, <, <=, >, >=, plus methods
like, ilike, regexp, startswith, endswith, contains,
belongs. Combine with & (AND), | (OR), ~ (NOT).
Python's
and/orcan't be overloaded, so you must use&and|— and because they bind tighter than==, the parentheses around each side are required.
Set — a queryable set of records
Calling the DAL with a query produces a Set:
adults = db(db.person.age >= 18)
A Set doesn't run any SQL yet — it just remembers the query. The
real work happens when you call one of its methods:
adults.count() # SELECT COUNT(*) FROM person WHERE … adults.select() # SELECT * FROM person WHERE … adults.update(age=21) adults.delete() adults.isempty()
Set also has _select, _update, _count, _delete (with
underscore) that return the generated SQL string instead of
executing it — handy for inspection, embedding as a sub-query, or
debugging.
Rows — the result of select()
select() returns a Rows object: iterable, indexable, sliceable,
and self-serializing to CSV via str(rows).
rows = db(db.person.age >= 18).select() for row in rows: print(row.id, row.name) len(rows) # number of rows rows[0] # first Row rows.first() rows.last() rows.as_dict() # {id: row, ...} rows.as_list() # [{name: …}, …]
For large result sets, use iterselect() instead — it returns rows
one at a time without loading them all into memory.
Row — a single record
A Row is a dict that also supports attribute access:
row = rows[0] row.name # attribute row["name"] # item row("person.name") # qualified name (useful when join columns collide)
Row methods:
row.update_record(name="Alice") # persists the change to the DB row.delete_record()
update_record is not the same as row.update(...) — the latter
updates only the in-memory dict.
Expression
Many things you'd write in SQL — UPPER(name), age + 1,
SUM(salary), ordering clauses — appear as Expression objects in
pyDAL. You build them with field methods and arithmetic, and use
them anywhere a field is allowed:
total = db.person.salary.sum() db().select(total) db().select(db.person.ALL, orderby=db.person.name | db.person.id) db().select(db.person.name.upper())
Field is itself a subclass of Expression.
Inserting, updating, deleting
Insert
rid = db.person.insert(name="Alex", age=30) # returns the new id db.person.bulk_insert([ # one query, many rows {"name": "Bob", "age": 25}, {"name": "Carl", "age": 42}, ])
update_or_insert writes a new record only if no existing record
matches:
db.person.update_or_insert(db.person.name == "John", name="John", age=30)
validate_and_insert / validate_and_update run the field validators
first and return {"id": …, "errors": {…}, "success": bool}.
Update and delete via a Set
db(db.person.age < 18).delete() # returns number deleted db(db.person.age >= 18).update(adult=True) # returns number updated
Update values can be expressions:
db(db.person.name == "Alex").update(visits=db.person.visits + 1)
Shortcuts
person = db.person[42] # → Row with id=42, or None db.person[42] = {"name": "Alice"} # update db.person[None] = {"name": "Alice"} # insert del db.person[42] # delete
Selecting
The basic shape:
rows = db(query).select(*fields, **options)
Field lists work like a SQL SELECT clause:
db().select(db.person.ALL) # all columns db().select(db.person.id, db.person.name) # specific columns db(db.person).select(db.person.name) # query is just the table
Options
| Option | Effect |
|---|---|
orderby= |
ORDER BY. Use ~field for DESC. |
groupby= |
GROUP BY |
having= |
HAVING (with groupby) |
limitby=(off, end) |
LIMIT end-off OFFSET off |
distinct=True |
DISTINCT |
distinct=field |
DISTINCT ON (field) (PostgreSQL) |
for_update=True |
FOR UPDATE |
join= |
INNER JOIN (table.on(condition)) |
left= |
LEFT OUTER JOIN |
cache= |
wrap the result in a cache decorator |
Example:
rows = db(db.person.age >= 18).select( db.person.id, db.person.name, orderby=~db.person.age, limitby=(0, 10), )
Joins
The simplest join is implicit — reference fields from two tables in the
query and pyDAL puts them in FROM:
rows = db(db.person.id == db.thing.owner_id).select() for row in rows: print(row.person.name, "owns", row.thing.name)
The explicit form uses table.on(condition):
rows = db(db.person).select( db.person.name, db.thing.name, join=db.thing.on(db.person.id == db.thing.owner_id), )
left= produces a LEFT OUTER JOIN — useful when you want all rows of
the driving table even if the join has no match:
rows = db().select( db.person.ALL, db.thing.ALL, left=db.thing.on(db.person.id == db.thing.owner_id), )
Self-references and table aliases
When you need to join a table to itself (parent/child trees, etc.),
use with_alias:
db.define_table("person", Field("name"), Field("father_id", "reference person")) Father = db.person.with_alias("father") rows = db().select( db.person.name, Father.name, left=Father.on(db.person.father_id == Father.id), )
Operators and expressions
Comparison and logical
db(db.person.age == 21).select() db(db.person.age != 21).select() db((db.person.age > 18) & (db.person.age < 65)).select() db((db.person.name == "Alex") | (db.person.name == "Bob")).select() db(~(db.person.role == "admin")).select()
String matching
db(db.person.name.like("A%")).select() db(db.person.name.ilike("a%")).select() # case-insensitive db(db.person.name.startswith("A")).select() db(db.person.name.endswith("son")).select() db(db.person.name.contains("li")).select() db(db.person.name.regexp("^A.*")).select() # backend-dependent db(db.person.name.upper().like("AL%")).select()
Aggregates
db.person.salary.sum() db.person.salary.avg() db.person.salary.min() db.person.salary.max() db.person.id.count() db.person.name.len()
Use them anywhere a field is accepted:
total = db.person.salary.sum() row = db().select(total).first() print(row[total])
Dates
db(db.log.event_time.year() == 2026).select() db(db.log.event_time.month() >= 6).select() db(db.log.event_time.day() == 15).select() db(db.log.event_time.hour() < 12).select()
belongs / IN
db(db.person.id.belongs([1, 2, 3])).select()
With a subquery (note _select, not select — we want SQL, not rows):
recent = db(db.log.severity == 3)._select(db.log.user_id) db(db.person.id.belongs(recent)).select()
case
condition = db.person.age >= 18 label = condition.case("adult", "minor") rows = db().select(db.person.name, label) for row in rows: print(row.person.name, row[label])
Defaults: coalesce, coalesce_zero
display = db.user.fullname.coalesce(db.user.username) db().select(display) total = db.user.points.coalesce_zero().sum() db().select(total)
Substrings
db().select(db.thing.name[:3]) # first 3 characters db(db.thing.name[:1] == "A").select() # name starts with A
Subqueries
pyDAL offers three ways to build a subquery. All three produce the same result; the AST-native forms are recommended for new code because their bound parameters flow through to the cursor cleanly.
# 1. Recommended: AST-native. sub = db(db.thing.color == "red").subselect(db.thing.owner_id) db(db.person.id.belongs(sub)).select() # 2. Legacy Select object — works as a subquery or as a join source. sub = db(db.thing.color == "red").nested_select(db.thing.owner_id) db(db.person.id.belongs(sub)).select() # 3. Raw SQL string (inline only). sub = db(db.thing.color == "red")._select(db.thing.owner_id) db(db.person.id.belongs(sub)).select()
nested_select is also the way to use a SELECT as a join source —
give it an alias with .with_alias(name) and use it like a table:
sub = db(db.thing.color == "red").nested_select( db.thing.owner_id, db.thing.name ).with_alias("red_things") db(db.person).select( db.person.name, sub.name, join=sub.on(sub.owner_id == db.person.id), )
Common Table Expressions
A CTE — WITH name AS (SELECT …) — is built with set.cte(name, *fields):
recent = db(db.event.created > "2026-01-01").cte( "recent", db.event.id, db.event.user_id ) db(db.user.id.belongs(recent.user_id)).select()
Recursive CTEs use .union(lambda self: …) to add the recursive step:
descendants = ( db(db.org.id == root_id).cte( "descendants", db.org.id, db.org.name, db.org.parent_id, ) .union(lambda descendants: db(db.org.parent_id == descendants.id).nested_select( db.org.id, db.org.name, db.org.parent_id, ) ) ) db().select(descendants.ALL)
Computed and virtual fields
A computed field is calculated on insert/update and stored:
db.define_table("person", Field("first"), Field("last"), Field("full", compute=lambda row: f"{row['first']} {row['last']}"), )
A virtual field is computed every time you access it, from the result of a select — not stored, not queryable, but free:
class PersonMethods: def full(row): return row.first + " " + row.last db.person.full = Field.Virtual("full", lambda row: row.first + " " + row.last)
Common filters
Attach a query to a table and every Set against that table will pick
it up automatically. Useful for soft-delete or tenant isolation:
db.thing._common_filter = lambda q: db.thing.deleted == False
Bypass with db(query, ignore_common_filters=True).
Callbacks
Hook into insert / update / delete events:
db.thing._before_insert.append(lambda fields: ...) db.thing._after_update.append(lambda set, fields: ...) db.thing._after_delete.append(lambda set: ...)
Returning a truthy value from a _before_* callback cancels the
operation.
Validators
A validator is a callable that checks (and often coerces) a value
before it reaches the database. You attach one — or a list — to a
field via requires=:
from pydal.validators import IS_NOT_EMPTY, IS_EMAIL, IS_INT_IN_RANGE db.define_table("person", Field("name", requires=IS_NOT_EMPTY()), Field("email", requires=[IS_NOT_EMPTY(), IS_EMAIL()]), Field("age", "integer", requires=IS_INT_IN_RANGE(0, 150)), )
Validators run when you call validate_and_insert /
validate_and_update (and Form in py4web). A plain insert does
not invoke them — they're meant for input that crosses a trust
boundary. Each validator returns (cleaned_value, error_or_None), so
a string "42" going through IS_INT_IN_RANGE is stored as the int
42.
result = db.person.validate_and_insert(name="", email="bad", age=200) # result == {"id": None, "errors": {"name": "Enter a value", # "email": "Enter a valid email address", # "age": "Enter an integer between 0 and 149"}, # "success": False}
If you don't set requires=, pyDAL installs a default validator
chain appropriate to the field type — IS_LENGTH for strings,
IS_INT_IN_RANGE for integers, IS_DATE for dates, IS_IN_DB for
references, and so on (see pydal/default_validators.py).
Built-in validators
| Validator | Purpose |
|---|---|
IS_NOT_EMPTY() |
non-blank (also strips whitespace) |
IS_LENGTH(maxsize, minsize) |
string / file length bounds |
IS_MATCH(regex) |
regex match |
IS_EQUAL_TO(value) |
exact equality (e.g. password confirmation) |
IS_ALPHANUMERIC() |
letters, digits, underscore |
IS_SLUG(maxlen, check) |
converts to a URL slug |
IS_LOWER() / IS_UPPER() |
case coercion |
IS_INT_IN_RANGE(min, max) |
integer, min <= v < max (exclusive upper) |
IS_FLOAT_IN_RANGE(min, max) |
float, inclusive bounds |
IS_DECIMAL_IN_RANGE(min, max) |
Decimal, inclusive bounds |
IS_DATE(format) |
parses to datetime.date |
IS_TIME() |
parses hh:mm[:ss] [am/pm] to datetime.time |
IS_DATETIME(format, timezone) |
parses to datetime.datetime |
IS_DATE_IN_RANGE(min, max) / IS_DATETIME_IN_RANGE(...) |
date/datetime + bounds |
IS_EMAIL(banned, forced) |
email address, with optional domain allow/deny |
IS_LIST_OF_EMAILS() |
comma- or semicolon-separated email list |
IS_URL(mode, allowed_schemes, prepend_scheme) |
http(s) URL |
IS_IPV4() / IS_IPV6() / IS_IPADDRESS() |
IP addresses |
IS_JSON(native_json) |
parses / validates JSON |
IS_IN_SET(items, multiple, zero, sort) |
value in an explicit list |
IS_IN_DB(dbset, field, label, multiple) |
value is an existing FK |
IS_NOT_IN_DB(dbset, field) |
value is unique (enforces a UNIQUE check) |
IS_LIST_OF(other, minimum, maximum) |
list whose items each pass other |
IS_LIST_OF_STRINGS() / IS_LIST_OF_INTS() |
parses CSV / JSON-list input |
IS_FILE(filename, extension) |
uploaded file name / extension match |
IS_IMAGE(extensions, maxsize, minsize, aspectratio) |
uploaded image checks |
IS_UPLOAD_FILENAME(...) |
legacy; prefer IS_FILE |
IS_SAFE(sanitizer, mode) |
strips/rejects unsafe HTML |
CLEANUP(regex) |
strips control characters |
CRYPT(key, digest_alg, min_length, salt) |
hashes passwords lazily |
IS_STRONG(min, upper, lower, number, special, entropy) |
password complexity |
IS_EXPR(expression) |
arbitrary Python expression (value in scope) |
Combinators
IS_EMPTY_OR(other, null=None)— make any validator optional. Blank input is converted tonull(defaultNone); non-blank input is passed toother. Aliased asIS_NULL_OR.ANY_OF([v1, v2, ...])— succeeds if at least one inner validator passes. Useful when a field accepts more than one shape.
Field("contact", requires=ANY_OF([IS_EMAIL(), IS_IPADDRESS()])) Field("nickname", requires=IS_EMPTY_OR(IS_LENGTH(3, 32)))
Custom validators
Any callable f(value) -> (cleaned, error_or_None) works as a
validator. For richer behavior (translation, record_id-aware
uniqueness checks), subclass Validator and implement validate:
from pydal.validators import Validator, ValidationError class IS_EVEN(Validator): def __init__(self, error_message="Must be even"): self.error_message = error_message def validate(self, value, record_id=None): if int(value) % 2 != 0: raise ValidationError(self.translator(self.error_message)) return int(value) Field("n", "integer", requires=IS_EVEN())
Every validator accepts an error_message= constructor argument to
override the default message. Messages are passed through
Validator.translator if you wire up an i18n hook.
Passwords
CRYPT returns a LazyCrypt object that hashes on demand and knows
how to compare itself with a stored algo$salt$hash string:
db.define_table("user", Field("password", "password", requires=[IS_STRONG(min=10, upper=2, special=2), CRYPT()]), ) db.user.validate_and_insert(password="hunter2-Strong!") stored = db.user[1].password # 'pbkdf2(1000,20,sha512)$...$...' CRYPT()("hunter2-Strong!")[0] == stored # True
Migrations
By default, when you call define_table with a different schema from
last run, pyDAL emits the appropriate ALTER TABLE statements. The
metadata is kept in a small file under folder/ (one per table).
Disable per-table:
db.define_table("legacy", Field("name"), migrate=False)
Disable globally:
db = DAL("...", migrate_enabled=False)
After a destructive schema change, you may need a fake migration — tell pyDAL the current state matches the file without running any DDL:
db.define_table("thing", ..., fake_migrate=True)
CSV import/export
Per-table:
db.thing.export_to_csv_file(open("thing.csv", "w")) db.thing.import_from_csv_file(open("thing.csv"))
Whole database:
db.export_to_csv_file(open("dump.csv", "w")) db.import_from_csv_file(open("dump.csv"))
Natural-language queries: QueryBuilder
Turn an English-ish string into a real query:
from pydal import QueryBuilder builder = QueryBuilder(db.thing) q = builder.parse('name starts with "C" and color == "red"') db(q).select()
Recognized tokens: not, and, or, ==, !=, <, >, <=,
>=, is, is null, is not null, is true, is false,
contains, starts with, belongs, upper, lower. Custom
aliases let you localize the vocabulary or rename fields.
Optional tools
The modules under pydal.tools and the top-level pydal.restapi are
not part of the core DAL — nothing in pydal imports from them. Use
them when they fit, ignore them otherwise. Each one persists state in
DAL-managed tables, so swapping the backend keeps working.
Tagging records: pydal.tools.tags
Tags attaches hierarchical tag paths (color/red, style/modern) to
any table without altering its schema — tags live in a sibling
<tablename>_tag_<name> table that is created on first use.
from pydal.tools.tags import Tags tags = Tags(db.thing) # creates db.thing_tag_default tags.add(thing_id, "color/red") tags.add(thing_id, ["color/red", "style/modern"]) # idempotent
Reading and removing:
tags.get(thing_id) # ["color/red", "style/modern"] tags.remove(thing_id, "color/red")
find returns a Query you pass to db(...). Tag paths support
prefix matching, so find("color") matches every record tagged
color/*:
db(tags.find("color/red")).select() # exactly that tag db(tags.find("color")).select() # any color/* tag db(tags.find(["color/red", "style/modern"])).select() # AND db(tags.find(["color/red", "color/blue"], mode="or")).select() # OR
A single table can carry multiple independent taxonomies by passing
a name to the constructor:
categories = Tags(db.thing, name="categories") flags = Tags(db.thing, name="flags") # creates db.thing_tag_categories and db.thing_tag_flags
Background tasks: pydal.tools.scheduler
A minimal cron-style scheduler that persists task runs in a DAL-managed
task_run table and executes them in forked child processes.
from pydal import DAL from pydal.tools.scheduler import Scheduler, now, delta db = DAL("sqlite://storage.sqlite") scheduler = Scheduler(db, max_concurrent_runs=2, folder="/tmp/scheduler") def send_report(user_id): ... return {"sent": True} scheduler.register_task("send_report", send_report) scheduler.enqueue_run(name="send_report", inputs={"user_id": 42}) scheduler.enqueue_run(name="send_report", inputs={"user_id": 7}, scheduled_for=now() + delta(60)) # in 60s scheduler.enqueue_run(name="send_report", inputs={"user_id": 1}, period=3600) # hourly scheduler.enqueue_run(name="send_report", inputs={"user_id": 9}, priority=-10, timeout=30) # higher prio, 30s cap scheduler.start() # spawns a background loop thread # ... your program continues ... scheduler.stop() # joins the loop thread cleanly
Each call to enqueue_run inserts a row into db.task_run; the loop
picks the next ready row (lowest priority first, then oldest id),
forks a daemon process, and records the outcome:
| Status | Meaning |
|---|---|
queued |
waiting for a worker |
assigned |
claimed by a worker, not yet forked |
running |
child process is executing |
completed |
finished, output column holds the return |
failed |
raised; traceback captured in log |
timeout |
exceeded timeout seconds, killed |
dead |
child process disappeared |
unknown |
enqueued under a name not in register_task |
Inputs and outputs are stored as JSON, so task arguments must be
JSON-serializable and returns must be too (or None). Task stdout/
stderr from the child are captured into the row's log column.
Scheduler constructor parameters:
db— the DAL to persisttask_runinto.max_concurrent_runs— per-worker cap on in-flight children (default2).folder— where per-run log files are buffered (default/tmp/scheduler).sleep_time— seconds to sleep between idle polls (default10).logger— customlogging.Logger(default writes to stdout).
Multiple processes can share the same db and run their own
Scheduler instance — task assignment is race-safe via an
update-with-where check.
JSON REST API: pydal.restapi
RestAPI is a JSON CRUD front-end for any DAL. You hand it a Policy
(what's allowed, on which tables, for which methods), and call it like
an HTTP handler:
from pydal.restapi import RestAPI, Policy policy = Policy() policy.set(tablename="person", method="GET", authorize=True, allowed_patterns=["name.*", "age.*"], limit=200, allow_lookup=True) policy.set(tablename="person", method="POST", authorize=True, fields=["name", "age"]) policy.set(tablename="person", method="PUT", authorize=True) policy.set(tablename="person", method="DELETE", authorize=True) api = RestAPI(db, policy) api("GET", "person", get_vars={"name.startswith": "A", "@limit": 10}) api("GET", "person", id=42) api("POST", "person", post_vars={"name": "Alice", "age": 30}) api("PUT", "person", id=42, post_vars={"age": 31}) api("DELETE", "person", id=42)
Every call returns a JSON-serializable dict with status, code,
timestamp, and api_version; errors are converted to structured
responses (401 policy violation, 404 not found, 400 invalid,
422 validation errors).
Two pre-built policies are shipped: ALLOW_ALL_POLICY (wildcard, all
methods authorized) and DENY_ALL_POLICY (empty).
GET query language. Regular get-vars are field predicates:
field[.subfield][.op]=value
where op is one of eq (default), ne, lt, gt, le, ge,
startswith, contains, in (comma-separated values). Prefix with
not. to negate. Up to four dotted hops traverse reference fields:
api("GET", "thing", get_vars={"owner.name.startswith": "A"})
@-prefixed meta-options control the response shape:
| Meta-option | Effect |
|---|---|
@offset/@limit |
Pagination (capped by policy limit). |
@order |
Comma-separated fields; ~field for DESC. |
@lookup |
Reference traversal — include joined records. |
@model |
Include the table schema in the response. |
@options_list |
Return {value, text} pairs instead of full rows. |
@count |
Include a total count (independent of @limit). |
Policy attributes per (tablename, method):
authorize—True/Falseorf(tablename, id, get_vars, post_vars) -> bool.fields— list of allowed field names (Nonemeans all readable/writable).query— a common filterQueryapplied to every GET (e.g. tenant scoping).allowed_patterns/denied_patterns—fnmatchagainst get-var keys.limit— max value accepted for@limit.allow_lookup— whether@lookup=traversal is honored.
Use tablename="*" as a wildcard fallback for any table not explicitly
listed.
Generating SQL without a database
You can use pyDAL purely as a SQL generator — no Postgres/MySQL driver installed, no database server running. Open an in-memory SQLite (always available, no driver to install), define your schema, then swap the dialect on the existing adapter to render the same queries against the target backend's syntax:
from pydal import DAL, Field from pydal.backends.postgres import PostgresDialect # Always-available "scratch" connection. No external database needed. db = DAL("sqlite:memory", migrate=False) # Retarget SQL emission to PostgreSQL, and ask for inline values # (placeholder-free SQL) — handier for human inspection than the # parameterized form used at runtime. db._adapter.dialect = PostgresDialect(db._adapter) db._adapter.compiler.parameterize = False db.define_table("person", Field("name"), Field("age", "integer")) q = (db.person.age >= 18) & (db.person.name.like("A%")) # The five "_underscore" entry points return SQL strings without # executing anything against the (in-memory) database. print(db(q)._select(db.person.id, db.person.name)) # SELECT "person"."id", "person"."name" FROM "person" # WHERE (("person"."age" >= 18) AND ("person"."name" LIKE 'A%' ESCAPE '\')); print(db(db.person.age < 18)._delete()) # DELETE FROM "person" WHERE ("person"."age" < 18); print(db(db.person.id == 1)._update(name="Alice")) # UPDATE "person" SET "name"='Alice' WHERE ("person"."id" = 1); print(db.person._insert(name="Alice", age=30)) # INSERT INTO "person"("name","age") VALUES ('Alice',30); print(db(db.person.age >= 18)._count()) # SELECT COUNT(*) FROM "person" WHERE ("person"."age" >= 18);
The query AST is dialect-agnostic, so the same Query retargets when
you swap dialects mid-flight — handy for cross-backend comparisons:
q = db.person.name.regexp("^A") # Already swapped to Postgres above: print(db(q)._select(db.person.id)) # SELECT "person"."id" FROM "person" WHERE ("person"."name" ~ '^A'); # Switch to MySQL on the spot: from pydal.backends.mysql import MySQLDialect db._adapter.dialect = MySQLDialect(db._adapter) print(db(q)._select(db.person.id)) # (MySQL-flavored SQL emitted for the same Query object)
Raw SQL escape hatch
When the DSL doesn't cover what you need:
rows = db.executesql("SELECT * FROM thing WHERE name = ?", placeholders=["Chair"])
For inspection without execution, every Set method has an underscore
counterpart that returns the generated SQL:
db(db.thing.name == "Chair")._select() # 'SELECT "thing"."id", "thing"."name" FROM "thing" WHERE ("thing"."name" = ?);'
Ecosystem
pyDAL is a standalone library — drop it into any Python project. It is also the data layer used by py4web, which can automatically generate forms and grids from pyDAL table metadata. If you're building a full web app, py4web saves you a lot of plumbing; if you just need a database layer, pyDAL alone is enough.
Package layout
Two modules hold everything backend-related:
pydal.backend_base— the framework. Defines the four base abstractions (SQLAdapter/NoSQLAdapter,SQLDialect/NoSQLDialect,BasicParserand friends,SQLRepresenter/NoSQLRepresenter/JSONRepresenter), the four registries (adapters,dialects,parsers,representers), and the decorators backends use to register with those registries (sqltype_for,register_expression,for_type,before_parse,repr_for_type,before_type,for_instance,pre).pydal.backends.<name>— one module per database (postgres,mysql,sqlite,mssql,oracle,db2,firebird,informix,ingres,sap,snowflake,teradata,google,mongo,couchdb). Each module contains everything for that backend: the adapter class, the dialect, the parser, and the representer.
The four pieces collaborate as follows:
| Layer | Direction | Owner |
|---|---|---|
| Adapter | session-level — owns the connection | BaseAdapter / SQLAdapter / NoSQLAdapter |
| Dialect | AST node → SQL string | CommonDialect / SQLDialect / NoSQLDialect |
| Representer | Python value → SQL literal | BaseRepresenter / SQLRepresenter / NoSQLRepresenter / JSONRepresenter |
| Parser | driver row value → Python value | BasicParser + per-type *Parser mixins |
When you instantiate DAL("postgres://…"), the URI prefix selects a
registered adapter from adapters; that adapter walks its MRO to pick
the matching dialect, representer, and parser. Per-type behavior is
declared with decorators rather than overrides — for example, a
backend's representer adds @repr_for_type("jsonb") to expose a JSONB
encoder, and a backend's parser adds @for_type("jsonb") for the
matching decoder. (Decorator names are split — for_type for parsers,
repr_for_type for representers — because both subsystems use the
same concept and Python doesn't allow two classes with the same name
in a single module.)
Adding a new backend means writing a single
pydal/backends/<name>.py containing an adapter subclass plus
whichever of dialect/parser/representer override the defaults, then
adding the import to pydal/backends/__init__.py.
License
pyDAL is released under the BSD-3-Clause license. See LICENSE.txt.