Network latency is a simple reality of physics: communications between clients and servers must travel through the infrastructure of the Internet and are absolutely limited by the speed of light. For example, it would take approximately 10ms for light to travel from San Francisco to Denver and back — there is no way for a client in San Francisco to exchange messages with a server in Denver in less than 10ms.
You can check how long it takes your device to communicate with servers in various Google Cloud regions; the latency between your device and each server depends on how far away that server is. Similarly, communications between SQL clients and Postgres servers are not instantaneous: the latency depends on the distance between the client and the server.
Latency can pose serious practical issues, particularly in industries like finance and e-commerce where rapid data processing is essential. In the financial industry, for example, delays in accessing data can result in missed opportunities or losses for trading firms. In e-commerce, slow database performance can cause delays in loading pages and processing transactions, leading to frustrated customers and lost sales.
In this post, we demonstrate some of the ways in which network latency affects connecting to and querying Postgres databases. We also show some ways to mitigate these issues with techniques like pipeline mode and copy so data processing doesn’t get in the way of important decisions or delay business-critical transactions.
Network Latency Affects Connections to Postgres Database Servers
Latency affects parts of the Postgres connection and querying lifecycle differently. Establishing the initial connection to the database is often the most time-consuming single part of the process and inevitably requires some back-and-forth communication between the client and the Postgres server. There are several steps required to initialize a connection between a Postgres client and server, each of which may require one or more messages sent between the client and the server. Some of these steps include:
- Negotiating the use of SSL
- Requesting/Providing authentication details such as passwords
- Accepting/Rejecting authentication attempts
- Providing settings/run-time parameters to the server
- Telling the client that the server is ready to accept queries
Some of the time spent establishing a connection has nothing to do with network latency: the server needs to do some work to prepare to accept queries. But network latency does significantly impact connection time, and is compounded by each message sent back and forth.
Network latency increases with distance. We tested the time it took to connect to a bit.io database in Iowa from different locations across the U.S. We found that the median connection time was about 70 ms for a client in Iowa, 150 ms for a client in Virginia, and 250 ms for a client in California (Figure 1).
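As a rough illustration of how such measurements can be made, here is a minimal psycopg sketch that times connection establishment from the client side. The connection string is a placeholder, and this is not the exact benchmarking code behind Figure 1:

```python
# Minimal sketch: timing connection establishment with psycopg (psycopg 3).
# The connection string below is a placeholder; substitute your own credentials.
import statistics
import time

import psycopg

CONNINFO = "host=db.example.com dbname=mydb user=me password=secret"  # placeholder


def median_connect_time_ms(conninfo: str, trials: int = 10) -> float:
    """Return the median time (in ms) to establish and close a connection."""
    times = []
    for _ in range(trials):
        start = time.perf_counter()
        with psycopg.connect(conninfo):
            pass  # connection established and authenticated; close immediately
        times.append((time.perf_counter() - start) * 1000)
    return statistics.median(times)


print(f"median connection time: {median_connect_time_ms(CONNINFO):.1f} ms")
```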
While the geographical differences in connection time are significant, most users are worried about latency as it affects queries, not initial connections. A client only needs to connect to a Postgres server once at the beginning of a set of operations. Those operations, however, may require the exchange of many messages between the client and the server, resulting in a lot of time lost to latency.
Network Latency Affects Queries
Like establishing connections, executing queries can require a lot of back-and-forth communication between the client and the server. Unlike connections, queries are often executed in a sequence, with many queries run in a given session. The time lost to network latency increases both with distance and with the types of queries and how they’re run.
Figure 2 shows the time, including network overhead, it takes to execute a simple SELECT 1 query on a bit.io database server in Iowa from clients in different regions. Without network latency, the server runs the query in less than 0.015 ms (or, if you are Admiral Grace Hopper, about 4.5 kilometers) regardless of region. Most of the total execution time is spent in communication between the client and the server, ranging from less than 5 ms for a client in Iowa to more than 50 ms for a client in Oregon.
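The round-trip time for such a query is easy to measure from the client side with a simple timer. A minimal psycopg sketch (connection string assumed) looks like this:

```python
# Minimal sketch: client-side timing of a trivial query. For SELECT 1, the
# measured time is dominated by the network round trip, not server execution.
import time

import psycopg

with psycopg.connect("host=db.example.com dbname=mydb") as conn:  # placeholder conninfo
    with conn.cursor() as cur:
        start = time.perf_counter()
        cur.execute("SELECT 1")
        cur.fetchone()
        print(f"round trip: {(time.perf_counter() - start) * 1000:.2f} ms")
```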
Network Latency Affects Some Queries More Than Others
We might expect complex, long-running queries to suffer from network latency. Let’s explore that expectation.
This table on bit.io has a million rows. Running the modestly complex query below, which includes a subquery and some string matching, takes approximately 1880ms, inclusive of network latency (from a client in Minneapolis to a bit.io Postgres server in Iowa).
Running the query with EXPLAIN ANALYZE reveals that it took about 1858ms without network overhead. In other words, network latency accounted for just over 1% of the total time. Use cases that involve periodically executing long-running queries, such as complex analyses on large stored datasets, will likely suffer little from network latency.
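One way to separate server-side execution time from network overhead is to compare client-side wall-clock time with the timings reported by EXPLAIN ANALYZE, which are measured on the server and exclude the network. A sketch of the idea, using a placeholder query and connection string:

```python
# Minimal sketch: estimating how much of a query's total time is network overhead
# by comparing client-side timing with server-reported EXPLAIN ANALYZE timings.
import time

import psycopg

QUERY = "SELECT count(*) FROM test_table WHERE name LIKE 'A%'"  # placeholder query

with psycopg.connect("host=db.example.com dbname=mydb") as conn:  # placeholder conninfo
    with conn.cursor() as cur:
        start = time.perf_counter()
        cur.execute(QUERY)
        cur.fetchall()
        total_ms = (time.perf_counter() - start) * 1000

        # EXPLAIN ANALYZE runs the query on the server and reports planning and
        # execution time without any network overhead.
        cur.execute("EXPLAIN ANALYZE " + QUERY)
        plan_lines = [row[0] for row in cur.fetchall()]

print(f"total client-side time: {total_ms:.1f} ms")
print("\n".join(line for line in plan_lines if "Time" in line))
```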
Conversely, we might expect very simple queries, which take almost no time at all for the server to execute, to suffer very little from network latency. But even the simplest of queries require at least one message from the client and a response from the server, resulting in the latency observed in figure 2. A SELECT 1 query sends one message to the server, which sends several messages in reply, as shown in the trace below.
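Schematically, the exchange for a single SELECT 1 looks something like the following (message names are from the Postgres frontend/backend protocol; F marks a message from the client, B a message from the server, and the exact formatting depends on the tracing tool):

```
F  Query            "SELECT 1"
B  RowDescription   1 column: "?column?"
B  DataRow          "1"
B  CommandComplete  "SELECT 1"
B  ReadyForQuery    idle
```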
What happens when we do this twice in a row?
The client sends the first query, waits for the ReadyForQuery response from the server, and then sends the second query. Both queries incur the full round-trip network latency. Extrapolate this to, say, thousands or millions of operations, and the potential problems become clear: a sequence of 100 simple queries takes a full second with 10 ms of round-trip latency. Those fast-paced finance or e-commerce operations start to look a bit trickier under these circumstances.
This gives us a big hint about what to look out for when it comes to Postgres network latency: sequences of queries, even very simple queries, may require a full round trip between the client and the server for each query in the sequence.
So is it hopeless? If your SQL client isn’t parked next door to your Postgres server, are you doomed to excruciatingly slow data pipelines? No!
Mitigating Network Latency
Inserts, updates, and deletes are commonly run in sequence which, as noted above, can be costly in terms of network latency. We previously wrote about improving the efficiency of populating Postgres tables with Pandas and recommended using COPY instead of separate INSERTs. In the remainder of this post, we’ll cover the results of a similar experiment. We generated 50 rows of synthetic data and inserted them into a bit.io database in Iowa from other locations across the U.S. using three different methods, measuring the times from each location:
- Sequence of INSERTs: INSERT each row of data separately.
- COPY: Instead of sending multiple INSERT statements, COPY loads all of the rows in a single command, with far less back-and-forth between the client and the server.
- Sequence of INSERTs (Pipeline Mode): Pipeline mode significantly reduces latency overhead by allowing clients to send a sequence of queries without waiting for the server's response to each one.
Figure 3 shows the difference between these methods. Each involves a sequence of messages exchanged between the client and the server. The y-axis shows the time elapsed since the client initiated the query; the x-axis shows the message number. COPY requires few messages and relatively little time. INSERTs in pipeline mode require many more messages but still take relatively little time: all of the client's messages are sent without waiting on the server's reply to each one, and all of the server's responses come afterward, so there is only one pause while the client waits for the server.
As we’d expect at this point, a sequence of INSERTs without any special handling takes a long time to execute: it takes the largest number of messages and the greatest number of round trips.
Figure 4 shows the time it took for each method to populate a table with 50 rows from clients in different regions to a bit.io database in Iowa.
Inserting the data took significantly longer with a sequence of INSERTs (without pipeline mode) than with COPY or with a sequence of INSERTs in pipeline mode. The results make it clear that network latency doesn't have the last word when it comes to query timing. While the sequence of inserts takes a long time from any location, the pipeline- and copy-based methods are fast, and latency increases very little with distance because neither method involves much back-and-forth between the client and the server.
What Do I Do Now?
In summary, network latency slows down Postgres connections and queries. It doesn’t affect all queries equally. Complex, long-running queries on large, static databases will suffer far less from network latency because the time it takes the server to execute the query far exceeds the time it takes the server and the client to communicate. Sequences of brief, simple queries, on the other hand, can get badly bogged down by the frequent back-and-forth communications between client and server. This isn’t cause for despair: as we showed, in the case of INSERTs, using COPY or pipeline mode radically reduces the impact of network latency.
The comparisons in this post were not comprehensive: there are other approaches to dealing with network latency, from consolidating smaller, discrete queries into larger queries to batching with e.g. psycopg’s executemany(). And specific needs will vary by region: clients on the other side of the world from the Postgres servers they’re querying will inevitably face worse latency, the speed of light being what it is, while clients much nearer to the Postgres servers may not need to worry about latency at all.
Lastly, it is important to test the impact of network latency on Postgres connections and queries because network latency can affect different queries in different ways, and it is sometimes counterintuitive which queries will be most affected by it. By testing and carefully analyzing the performance of different types of queries, developers can identify and mitigate the impact of network latency on their Postgres databases, and ensure that their applications perform optimally.
Appendix
Generating test data
We used the Mimesis Python library to generate test data. The following method takes the number of rows and a seed (for reproducibility) and returns a list of dictionaries that can be inserted into a Postgres database using the methods detailed below.
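A minimal sketch of this kind of generator is shown below. The fields (name, email, age) are illustrative assumptions rather than the experiment's exact schema, and the way a locale is passed to Mimesis varies slightly between library versions:

```python
# Minimal sketch: generating synthetic rows with Mimesis. Field names are
# illustrative; the original experiment's schema may differ.
from mimesis import Generic
from mimesis.locales import Locale  # older Mimesis versions take a locale string instead


def generate_rows(n_rows: int, seed: int = 42) -> list[dict]:
    """Return n_rows dictionaries of synthetic data, reproducible via the seed."""
    g = Generic(locale=Locale.EN, seed=seed)
    return [
        {
            "name": g.person.full_name(),
            "email": g.person.email(),
            "age": g.person.age(),
        }
        for _ in range(n_rows)
    ]


rows = generate_rows(50)
```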
INSERT methods
We used the psycopg library (psycopg3) for all of the insert operations in this project.
Sequence of INSERTs (no pipeline)
This approach iterates through the dictionaries of data and runs an INSERT statement for each row.
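A minimal psycopg sketch of this approach (table and column names are illustrative assumptions):

```python
# Minimal sketch: one INSERT per row, each waiting for the server's reply before
# the next is sent. Table and column names are illustrative.
import psycopg


def insert_rows_sequentially(conninfo: str, rows: list[dict]) -> None:
    with psycopg.connect(conninfo) as conn:
        with conn.cursor() as cur:
            for row in rows:
                cur.execute(
                    "INSERT INTO test_table (name, email, age) VALUES (%s, %s, %s)",
                    (row["name"], row["email"], row["age"]),
                )
        conn.commit()
```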
COPY
The COPY method “moves data between Postgres tables and standard file-system files.” It is “optimized for loading large numbers of rows; it is less flexible than INSERT, but incurs significantly less overhead for large data loads.”
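With psycopg, COPY can be driven from Python through the cursor's copy interface. A minimal sketch, using the same illustrative table and columns as above:

```python
# Minimal sketch: loading all rows with a single COPY command via psycopg's
# copy interface. Table and column names are illustrative.
import psycopg


def insert_rows_copy(conninfo: str, rows: list[dict]) -> None:
    with psycopg.connect(conninfo) as conn:
        with conn.cursor() as cur:
            with cur.copy("COPY test_table (name, email, age) FROM STDIN") as copy:
                for row in rows:
                    copy.write_row((row["name"], row["email"], row["age"]))
```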
Sequence of Inserts (Pipeline Mode)
The Postgres documentation states that:
“Pipeline mode is most useful when the server is distant, i.e., network latency (“ping time”) is high, and also when many small operations are being performed in rapid succession. […] A 100-statement operation run on a server 300 ms round-trip-time away would take 30 seconds in network latency alone without pipelining; with pipelining it may spend as little as 0.3 s waiting for results from the server.”
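In psycopg, pipeline mode is entered with the connection's pipeline() context manager (available in psycopg 3.1+ and requiring libpq 14 or newer). A minimal sketch, again with illustrative table and column names:

```python
# Minimal sketch: sending a sequence of INSERTs in pipeline mode, so the client
# does not wait for the server's reply after each statement. Requires psycopg 3.1+.
import psycopg


def insert_rows_pipelined(conninfo: str, rows: list[dict]) -> None:
    with psycopg.connect(conninfo) as conn:
        with conn.pipeline():
            with conn.cursor() as cur:
                for row in rows:
                    cur.execute(
                        "INSERT INTO test_table (name, email, age) VALUES (%s, %s, %s)",
                        (row["name"], row["email"], row["age"]),
                    )
        # exiting the pipeline block is a synchronization point; results and
        # errors are available once it closes
        conn.commit()
```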
Note that there are some significant limitations and warnings to be aware of when considering using pipeline mode, including:
- Only asynchronous operations using the extended query protocol work with pipeline mode. COPY does not work with pipeline mode.
- Pipeline mode generally results in higher memory use by both the client and the server.
- Pipeline mode is a feature implemented by Postgres clients, not servers. Not all clients support it and not all implementations are guaranteed to behave the same way. The psycopg docs note that:
The Pipeline mode is an experimental feature. Its behaviour, especially around error conditions and concurrency, hasn’t been explored as much as the normal request-response messages pattern, and its async nature makes it inherently more complex.
- Synchronization can be a challenge. Query results can be sent to the client at a synchronization point determined by the client, but the server can also initiate a flush of query results to the client.
Querying from Clients in Different Regions
We set up AWS Lambda functions to run queries with psycopg from different AWS regions. This enabled us to measure the impact of network latency from different locations. We took measurements periodically (every 5 minutes or every 45 minutes, depending on the specific experiment) and plotted the median latency. The measurements were collected from 2022-12-04 to 2022-12-06.
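A minimal sketch of such a Lambda handler, assuming the connection string is supplied through a DATABASE_URL environment variable (the actual measurement and scheduling code may differ):

```python
# Minimal sketch: an AWS Lambda handler that times connection establishment and a
# simple query with psycopg. The environment variable name is an assumption.
import os
import time

import psycopg


def lambda_handler(event, context):
    conninfo = os.environ["DATABASE_URL"]

    start = time.perf_counter()
    with psycopg.connect(conninfo) as conn:
        connect_ms = (time.perf_counter() - start) * 1000

        with conn.cursor() as cur:
            q_start = time.perf_counter()
            cur.execute("SELECT 1")
            cur.fetchone()
            query_ms = (time.perf_counter() - q_start) * 1000

    return {"connect_ms": round(connect_ms, 2), "query_ms": round(query_ms, 2)}
```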