Background
Have you ever browsed for a hotel online, and when you clicked on it, the price suddenly changed or it became sold out?
It might leave you wondering — how do online travel agencies (OTAs), like us at tiket.com, manage inventory in such a fast-paced industry, where prices and availability can change very rapidly?
Well, (spoiler alert!) that's no small feat, especially when you have a vast amount of inventory. In our case, we are dealing with 2.7 billion database rows of pricing and availability data for every hotel, room, and date, up to a year in advance. And this massive dataset changes extremely rapidly: as transactions happen around the world, or as prices fluctuate, the updated state needs to be propagated to all integrated parties.
Problem
The problem is that this huge dataset and massive traffic need to be processed and updated in real time, because even a few minutes of delay can lead to bad user experiences like price mismatches, overbooking, and inconsistent data.
Unfortunately, we had a major write bottleneck in Postgres. Despite running on robust hardware, we were experiencing delays of up to 4 hours before data was reflected in our apps. This made our product team unhappy, so we tried a bunch of workarounds, such as:
- Implementing change data capture (CDC) with Redis, so only data that has actually changed gets processed (a sketch of this idea follows the list).
- Implementing priority queuing, where certain data is prioritized over the rest: top-selling hotels or suppliers, near-term booking dates, shorter stay periods, etc.
- Improving queries and indexing, upgrading Postgres from 9.x to 12, denormalizing tables, partitioning tables, fine-tuning the Postgres config according to PGTune (https://pgtune.leopard.in.ua/), etc.
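To make the CDC idea concrete, here is a minimal, hypothetical sketch (not our actual code) of how such change detection can work with Redis, assuming the Jedis client; the key prefix and hashing scheme are illustrative only:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

import redis.clients.jedis.Jedis;

// Hypothetical change-detection sketch: hash each incoming supplier payload and
// skip processing when the stored hash for that inventory key is unchanged.
class ChangeDetector {
    private final Jedis redis = new Jedis("redis-host", 6379); // illustrative host

    boolean hasChanged(String inventoryKey, String payloadJson) throws Exception {
        MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
        String newHash = Base64.getEncoder()
                .encodeToString(sha256.digest(payloadJson.getBytes(StandardCharsets.UTF_8)));

        String previousHash = redis.get("cdc:" + inventoryKey);
        if (newHash.equals(previousHash)) {
            return false;                          // identical payload: nothing to write
        }
        redis.set("cdc:" + inventoryKey, newHash); // remember the latest state
        return true;                               // new or changed payload: process it
    }
}
```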
Even after implementing these optimizations, the core bottleneck remained: performance was still unpredictable and the delay was still unacceptable. The sheer scale of our dynamic data and the write-intensive nature of our workload continued to outpace Postgres. And with the company planning to expand and onboard more and more inventory, we had to rethink whether we should keep using Postgres at all.
Solution
PostgreSQL's inability to handle our growing write-intensive workload and real-time demands made it clear that a change was necessary, so we decided to say goodbye to Postgres and find a more scalable technology. Weighing consistency, performance, and scalability, we put several options on the table: Google Spanner, Elasticsearch, Cassandra, and Aerospike. After researching and comparing them, we initially decided to go with Cassandra. But while exploring it, we came across an article from Discord about their migration from Cassandra to ScyllaDB (link) due to performance issues. That article was the tie-breaker: after digging into it, we decided not to relive those challenges ourselves and to go directly to ScyllaDB.
First, we ran a performance POC, testing both read and write performance against our production load and pushing ScyllaDB to its limits. At first, the results seemed too good to be true, but after repeating and verifying the tests, they turned out to be genuinely mind-blowing. On writes in particular, the load test achieved more than 200x the throughput we got on Postgres. Sure, maybe we weren't using Postgres in the most optimal way, but getting results like that out of the box made ScyllaDB a no-brainer; we decided to go with it and bring it to production.
Results
With the following specs:
Postgres:
- 1 master (24 cores, 64 GB RAM) + 1 replica (32 cores, 128 GB RAM)
ScyllaDB:
- 9 nodes with a total of 36 cores and 144 GB RAM
we achieved the following results:
1. Write Performance
Our biggest challenge with Postgres was write performance: we are dealing with massive traffic and a large volume of data that requires real-time updates. So we prioritized the approach that works best for writes, even at the expense of reads, and here are the results in production:
When rolling out to production, we initially implemented double writes to both PostgreSQL and ScyllaDB, so we could verify ScyllaDB's performance and stability while the main traffic was still served by Postgres, before fully migrating. This approach also let us monitor real-time, head-to-head performance of the two databases under the same load for a fair comparison.
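Conceptually, the rollout looked like the following simplified sketch (class and method names are hypothetical, not our actual code): every write goes to both stores, Postgres remains the source of truth for reads, and a ScyllaDB failure during the trial is recorded but never propagated to users.

```java
// Simplified illustration of the double-write rollout; all names are hypothetical.
interface InventoryStore {
    void upsert(String hotelId, String roomId, String stayDate, String payloadJson);
}

class DualWriteStore implements InventoryStore {
    private final InventoryStore postgres; // existing write path, still serving reads
    private final InventoryStore scylla;   // new write path, evaluated in shadow mode

    DualWriteStore(InventoryStore postgres, InventoryStore scylla) {
        this.postgres = postgres;
        this.scylla = scylla;
    }

    @Override
    public void upsert(String hotelId, String roomId, String stayDate, String payloadJson) {
        postgres.upsert(hotelId, roomId, stayDate, payloadJson);   // main traffic
        try {
            scylla.upsert(hotelId, roomId, stayDate, payloadJson); // shadow write
        } catch (RuntimeException e) {
            // log / emit a metric only, so ScyllaDB issues cannot affect users during the trial
        }
    }
}
```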
Comparing the Kafka consumer lag of the Postgres pipeline with that of the ScyllaDB pipeline, the gap is astonishing: roughly 7,000x lower on ScyllaDB.
This translated into roughly 70x faster writes: the maximum delay dropped from 1.6 hours to 1.2 minutes, and the average from 19.7 minutes to 770 ms.
This is made possible by very low p99 latency on the database side: 445 ms vs. 29 ms at the maximum, and 195 ms vs. 8.88 ms on average.
And when we pushed traffic to the limit, as we did in the perf environment, the gap was even more apparent: ScyllaDB achieved roughly 257x higher throughput than Postgres.
2. Read Performance
For read operations, we didn't really have a problem with Postgres, since we could scale reads horizontally with replicas. Even so, we still get far better latency with ScyllaDB, up to 12x lower, even under slightly higher traffic.
The same held in our POC on the perf environment: pushing both databases to their maximum, Postgres topped out at 10K RPM compared to 160K RPM on ScyllaDB.
Moreover, with a new data model optimized for our discovery query patterns, the throughput is mind-blowing: on the same ScyllaDB specs we can reach up to 525K RPM. Database-wise, I think we could push it even further, but the rest of the stack (Redis, the Java apps, Mongo, and others) was already struggling, and the cost of getting to that point was already quite high.
3. Cost
On top of these huge improvements, we got another bonus: our costs dropped by approximately 60%. Because of how replication works in Postgres, we had to clone the entire dataset onto every replica, even though we only need a portion of the data. The indexes we created also consumed a lot of disk, especially on the big tables. With ScyllaDB, in contrast, the data is sharded: data and load are split across nodes rather than every node holding a full copy.
In total, our Postgres production environment had 1 master and 3 replicas used for different purposes, adding up to 88 cores, 320 GB RAM, and 16 TB of disk. With ScyllaDB, we only need 9 nodes with a total of 36 cores, 140 GB RAM, and 2.3 TB of disk. Roughly speaking, that saves up to 60% of the cost.
In addition, we plan to remove some of the workarounds that piled complexity onto our system, along with their supporting tech stacks (such as Redis for CDC and priority queuing), because we can now handle that massive traffic with ease, further reducing the cost.
4. Scalability
Besides performance and cost, one of the biggest gains from migrating to ScyllaDB is scalability. It is now much easier to scale the system, which matters even more as the company plans to expand soon and onboard more and more inventory. All we need to do is add nodes as needed. With Postgres, in contrast, we could only scale up the machine, which has hard limits, or take on the darkest nightmare of hell-level complexity: manual sharding.
What makes ScyllaDB so good in this case?
After seeing these remarkable results, you’re probably wondering: How can it be so fast, especially when it comes to writing data?
From our research and understanding, there are at least four key factors that contribute to this performance:
1. LSM Tree
The Log-Structured Merge tree (LSM tree) is a data structure optimized for high write performance, making it far better suited to our use case than the B-tree-based structures commonly used in SQL databases. Writing to an LSM tree is a lightweight, efficient process: the record is appended to an append-only commit log (sequential I/O) for durability and added to a small in-memory table, while the complex, heavy work of merging data, called compaction, is deferred to the background. This design never modifies existing data in place (on-disk files are immutable), effectively decoupling user operations from the underlying data structure.
This approach not only reduces write latency but also ensures a consistent and predictable performance profile for write-heavy workloads.
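As a rough mental model (a toy sketch, not ScyllaDB's actual implementation), the write path looks something like this: append to a sequential commit log, insert into an in-memory sorted table, and flush immutable files that are merged later by background compaction.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

// Toy LSM write path: sequential commit-log append + in-memory memtable insert;
// the expensive merging (compaction) of flushed files happens off the write path.
class ToyLsmWriter {
    private final BufferedWriter commitLog;
    private final TreeMap<String, String> memtable = new TreeMap<>();
    private final int flushThreshold;
    private int sstableId = 0;

    ToyLsmWriter(String logPath, int flushThreshold) throws IOException {
        this.commitLog = new BufferedWriter(new FileWriter(logPath, true)); // append-only
        this.flushThreshold = flushThreshold;
    }

    synchronized void write(String key, String value) throws IOException {
        commitLog.write(key + "=" + value); // sequential I/O for durability
        commitLog.newLine();
        commitLog.flush();
        memtable.put(key, value);           // cheap, sorted in-memory insert
        if (memtable.size() >= flushThreshold) {
            flushMemtable();                // a real engine does this in the background
        }
    }

    private void flushMemtable() throws IOException {
        File sstable = new File("sstable-" + (sstableId++) + ".dat"); // immutable once written
        try (BufferedWriter out = new BufferedWriter(new FileWriter(sstable))) {
            for (Map.Entry<String, String> entry : memtable.entrySet()) {
                out.write(entry.getKey() + "=" + entry.getValue());
                out.newLine();
            }
        }
        memtable.clear();
    }
}
```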
2. Shard per Core Design
ScyllaDB is built on Seastar, which is basically an asynchronous, shared-nothing framework designed to maximize processing efficiency and performance on multi-core architectures.
Typically, when a program runs, data is shared across cores, leading to lock contention and context switching, which, from the CPU's perspective, is expensive and takes ages. Seastar avoids this by sharding the data per core and using a single thread per core, similar to Redis's single-threaded model. But unlike Redis, which uses only one core no matter how many are available, the shard-per-core design lets Seastar use every core on the machine, leading to better resource utilization and far better scalability and performance. Learn more about this concept here.
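Here is a tiny Java sketch of the shard-per-core idea (an analogy only, nothing to do with Seastar's real internals): each key is routed to one shard, and each shard's data is touched by exactly one single-threaded executor, so no locks or cross-core sharing are needed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Shard-per-core analogy: one single-threaded executor "owns" each shard's data.
class ShardedStore {
    private final int shardCount;
    private final ExecutorService[] shardThreads;
    private final Map<String, String>[] shardData;

    @SuppressWarnings("unchecked")
    ShardedStore(int shardCount) {
        this.shardCount = shardCount;
        this.shardThreads = new ExecutorService[shardCount];
        this.shardData = new Map[shardCount];
        for (int i = 0; i < shardCount; i++) {
            shardThreads[i] = Executors.newSingleThreadExecutor(); // one "core" per shard
            shardData[i] = new HashMap<>();                        // only its own thread touches it
        }
    }

    void put(String key, String value) {
        int shard = Math.floorMod(key.hashCode(), shardCount);     // route by key, like a partition key
        shardThreads[shard].execute(() -> shardData[shard].put(key, value)); // no locking required
    }
}
```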
3. Internal Memory and Cache Management
ScyllaDB is written in C++ and has very good internal memory management and caching. In our case, the internal cache hit ratio is very high, resulting in very low latency for both read (10 ms) and write (4 ms) queries.
4. ScyllaDB's Upsert Nature Is a Perfect Fit for Our Write Behavior
Since we receive dynamic data from the supplier, we cannot determine in advance whether the data they send already exists in our database. This makes it challenging to decide whether to insert new records or update existing ones without first checking the data. However, due to the high traffic volume, performing such checks would severely impact performance and increase the risk of race conditions.
To handle this in Postgres, we used INSERT ... ON CONFLICT DO UPDATE queries. While this approach ensures data integrity, it is resource-intensive, as the database has to check whether the row already exists and verify the uniqueness of the conflicting keys before performing the actual write.
In contrast, ScyllaDB handles this more efficiently. By default, when we insert data where the primary key already exists, ScyllaDB performs an automatic upsert, thanks to its background compaction process. Since primary keys are unique, data with the same primary key is automatically merged during compaction. ScyllaDB also maintains a timestamp for each column’s value, ensuring that during the merge process, the latest values for each column are retained while older ones are discarded.
This is very powerful for our use case because we also support partial data updates. We simply write whatever piece of data the supplier sends (e.g., availability only, price only, restrictions only, or any combination of those) without merging it with existing data in the application layer first. That removes a lot of the load from checking existing data and minimizes race conditions, because we never have to read before writing.
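For illustration, here is a hedged sketch of what such a partial write can look like with the DataStax Java driver; the room_rate table and its columns are hypothetical, not our real schema:

```java
import java.math.BigDecimal;
import java.time.LocalDate;

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;

// Hypothetical partial-update writer: we insert only the columns present in the
// supplier payload; CQL INSERT is an upsert, so untouched columns (availability,
// restrictions, ...) keep their latest values and no read-before-write is needed.
class RateWriter {
    private final CqlSession session = CqlSession.builder().build();

    // Postgres equivalent we used to run (for contrast):
    //   INSERT INTO room_rate (...) VALUES (...)
    //   ON CONFLICT (hotel_id, room_id, stay_date) DO UPDATE SET price = EXCLUDED.price;
    private final PreparedStatement priceOnly = session.prepare(
            "INSERT INTO room_rate (hotel_id, room_id, stay_date, price) VALUES (?, ?, ?, ?)");

    void updatePrice(String hotelId, String roomId, LocalDate stayDate, BigDecimal price) {
        session.execute(priceOnly.bind(hotelId, roomId, stayDate, price));
    }
}
```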
Automatic Performance Tuning
Honestly, at first we were worried about stability and reliability, since we were new to ScyllaDB. We didn't know what problems we might run into in production: unpredictable performance issues, maintenance or operational challenges, or misconfigurations for our workload.
On Postgres, we had encountered all of these problems. For instance, we once dealt with severe table bloat caused by manually cleaning up old data in a large, busy table, which resulted in two prolonged and severe outages (before we implemented table partitioning, which let us simply drop old partitions). We also frequently experienced unpredictable latencies that caused glitches in our applications, or slow performance despite underutilized CPUs, even after increasing the specs and tweaking configurations on both the database and application sides.
But surprisingly, after using ScyllaDB for over three months, we haven't experienced any of these problems. We didn't do much configuration adjustment or tuning, mostly keeping the defaults, yet it performs well beyond our expectations. The performance is as good and consistent as advertised, resource utilization is excellent, and as a bonus we can clean up old data automatically using TTL. It's very maintenance-friendly (a big relief for the on-call guys).
I believe this consistent and predictable performance is largely driven by ScyllaDB's workload prioritization, which intelligently prioritizes user-triggered operations over system-triggered ones like compaction and backups. These system operations back off from resources that user requests are actively using, minimizing their impact on user traffic.
(almost forgot to mention this, thanks to Andrei Manakov for pointing this out)
Trade-offs
Yes, I know — there’s no silver bullet in software engineering. Every solution comes with trade-offs. Now, let’s talk about the caveats.
Unlike PostgreSQL, ScyllaDB is not as general-purpose and covers fewer use cases due to its query limitations. It lacks Postgres's rich query capabilities, such as joins, aggregates, subqueries, and geospatial queries. Additionally, ScyllaDB does not fully support ACID properties, making it less ideal where strict consistency and data integrity are required; that makes it a hard fit for fields like banking, where precision and consistency are critical.
These limitations also apply to our use case. We had to shift our paradigm from flexible query patterns to adhere to ScyllaDB’s strict query patterns. Fortunately, we haven’t encountered any dead-end blockers, but we’ve had to implement various workarounds while avoiding features like secondary indexes, materialized views, and multi-table designs that could degrade performance or significantly increase complexity.
For example, we moved data analytics operations away from direct database queries to a separate data pipeline. Similarly, we replaced database transactional rollbacks with manual error-handling logic at the application layer.
While we’ve been lucky enough to avoid major blockers, others might not be as fortunate. This could lead to implementing inefficient or overly complex solutions, ultimately calling into question whether adopting ScyllaDB is worth it. That said, if ScyllaDB is used for the right use case with the right approach, the results can be remarkable and truly game-changing.
Conclusion
The dramatic improvements achieved through our PostgreSQL to ScyllaDB migration highlight the fundamental architectural advantages of modern distributed systems. ScyllaDB's LSM tree storage engine, combined with its shard-per-core design, delivers performance characteristics that traditional relational databases simply cannot match. While Postgres excels in many use cases, our experience shows that for write-heavy, high-scale workloads, ScyllaDB's distributed architecture provides an entirely different level of performance and scalability. This migration demonstrates how matching database architecture to workload characteristics can yield order-of-magnitude improvements across all performance metrics.
Last, thanks to everyone involved in this fruitful project, especially my manager: Kshitiz Agarwal, my team: Habiburrahman, Handy Wijaya, Zulfahmi Ibnu Habibi, Andree Japutra, Rifky Ekayama, Fery Adwarsyah, Grace Claudia, Haryana Wisnu, Deasy Nathalia Sagita, Fadhlan Hasyim, Michael Daw Balma, Product Team, QA, Perftest Team, and Devops.
Article Contributors: Chatgpt, Claude AI, Gemini, and Me 😂
ScyllaDB Official Website: https://www.scylladb.com/