The 1600 columns limit in PostgreSQL - how many columns fit into a table

4 min read Original article ↗

A recent blog posting by Frédéric Delacourt (Did you know? Tables in PostgreSQL are limited to 1,600 columns) reminded me once again that in the analytics world customers sometimes ask for more than 1600 columns.

Quick recap: in OLTP, the aim is (usually) to use the 3rd normal form. In OLAP, tables are often only vaguely normalized, and wide or very wide fact tables in 2nd normal form are quite common. But are 1600 columns a bad idea? Yes. Do some applications generate such wide tables? Also yes. I’ve seen my fair share of customer requests and support tickets asking if the 1600 columns limit can be raised or even lifted.

But is that possible?

Why is there a limit

In PostgreSQL, a single row must fit into a single page on disk. The disk page size, by default, is 8 kB. As Frédéric shows in tests in his blog posting, sometimes even a smaller number of columns does not fit into the page.

Now my analytics background is not only with PostgreSQL. but also with WarehousePG (a Greenplum fork) and with Greenplum itself. In WarehousePG the default page size is 32 kB. Will this increase the number of columns? Unfortunately not:

1
2
3
~/sources/warehouse-pg main]$ grep -e MaxTupleAttributeNumber -e MaxHeapAttributeNumber src/include/access/htup_details.h
#define MaxTupleAttributeNumber 1664    /* 8 * 208 */
#define MaxHeapAttributeNumber  1600    /* 8 * 200 */

The fork is still using the same values for MaxTupleAttributeNumber and MaxHeapAttributeNumber, limited to 1600 columns. There’s also a comment near MaxHeapAttributeNumber in src/include/access/htup_details.h:

1
2
3
 * In any case, depending on column data types you will likely be running
 * into the disk-block-based limit on overall tuple size if you have more
 * than a thousand or so columns.  TOAST won't help.

Is it possible to increase the limit

It is possible to increase these limits, and create tables with a couple thousand columns. Theoretically, a single page fits 8136 single byte columns (like a BOOLEAN) in PostgreSQL. In WarehousePG this even fits 32712 single byte columns. But that is not the real limit.

The HeapTupleHeader has the t_infomask2 field, which is a uint16 (unsigned integer), defined in access/htup_details.h. Out of the available bits, 11 are used for the number of attributes:

1
#define HEAP_NATTS_MASK			0x07FF	/* 11 bits for number of attributes */

And 11 bits is 2047 attributes. Any tuple can have a maximum of 2047 attributes, even with all the 1600 safeguards increased or removed. In practice, it’s 2041 attributes. When inserting/updating a table, the database will not write more than those 2041 columns, all other columns are not set. If the column definition of the higher columns is NOT NULL, the INSERT or UPDATE fails with a constraint violation. Otherwise the higher columns ares simply set to NULL.

Bottom line: while the table can have many more columns, the database can’t write anything into these additional columns. Not without fully refactoring the way tuples are created internally.

Conclusion

In theory it is possible to raise the 1600 columns limit to a slightly larger number. In practice it is not worth the small gain, and is pushing internal safety boundaries built into the database.

Also in practice this will have all type of mostly unintended side effects and problems. This is untested territory, all unit tests must be updated as well. Tools like psql have a built-in limitation as well, which also must be raised. This in turn requires always using the patched binary, it might no longer be possible to use a “standard” psql against this database. Other tools might have problems as well with very wide tables.

Exporting the data is possible, but the table can no longer be imported into an unpatched version of the database. This basically creates a fork of a fork, which must be maintained and updated for every new minor and major version.

tl;dr: Don’t do this.

Thank you

Thanks to Robert Haas for reviewing the code assumptions about larger number of columns.



Comments

With an account on the Fediverse or Mastodon, you can respond to this post. Known replies are displayed below:

Learn how this is implemented here.


Share: