Change Data Capture (CDC) tools should be database-specialized, not generalized

1 points by cauchyk a year ago · 1 comment · 1 min read


Reference: https://x.com/saisrirampur/status/1824694191537959159 Parent Tweet: https://x.com/craigkerstiens/status/1824114371737616794

Change Data Capture (CDC) is hard: it has hundreds of edge cases and failure points. At PeerDB (https://www.peerdb.io/), instead of supporting multiple sources, we focused only on Postgres. This let us give enough care to ironing out as many edge cases as possible. We were also able to implement a number of Postgres-native performance and reliability optimizations. Our engineering blog, https://blog.peerdb.io/, has more on the optimizations and how we ironed out edge cases.

Pipeline failures have been rare these days, and so far none of the source databases have been affected by load. Also, most of our customers are in the shorter tail, i.e., data sizes from 300-400GB up to 15-20TB. This helped battle-test the product and make it seamless for the long tail.

However, I don’t think CDC is a solved problem, as Postgres is full of mysteries and it keeps evolving. We need to continue polishing the experience and evolve along with Postgres!

TL;DR: Specialized CDC tools that focus on a single database (or a limited set of databases) are a reliable way to provide a solid CDC experience.

vijaymohan1979 a year ago

Thanks for this post. We want to add support for Postgres in our product https://redfly.ai

Does Postgres support anything like SQL Server Change Tracking? Change Tracking does not require creating extra tables to record the history in detail, and such tables cause high disk I/O in resource-constrained environments.

Another question: does Postgres support anything like the SQL Server timestamp column?

Thanks for this post. Any information in this area is helpful as it is hard to find.
