Gazette – Build platforms that flexibly mix SQL, batch, and stream processing

github.com

46 points by twooster 4 years ago · 8 comments

jgraettinger1 4 years ago

Hi, I’m the primary author of gazette. Interesting seeing it here, happy to answer questions.

We’re continuing to improve it - it’s a core implementation detail of our current project & company Estuary Flow (https://estuary.dev), which aims to further simplify and democratize low latency data products.

jimsparkman 4 years ago

The slides [1] helped me grok what problems this tool solves pretty quickly.

Everything in big data is moving to blob storage these days, but streaming can lead to the small-files problem or longer latencies. File fragments stored locally with proxied readers seem like a simple solution to that.

[1] https://gazette.readthedocs.io/en/latest/overview-slides.htm...

  • sitkack 4 years ago

    Small files should be in arrow/avro/parquet if your architecture allows (one should strive for this from the beginning).

akshayshah 4 years ago

The GitHub README and docs mention that Gazette has been running in production for 5 years, but I don’t see any mention of _where_. I assume this began as an internal project at some company - does anyone know which?

  • jaw0 4 years ago

    it came out of arbor.io (now part of liveramp.com)

    • dominotw 4 years ago

      > liveramp.com

      I am going to give this project a big no if it has anything to do with liveramp.

johnthescott 4 years ago

in the postgresql world logical replication is a boon. an out-of-band immutable datastore is key, as well.

we are looking forward to qualified replication in upcoming pg15.
