DoltLab v0.2.0

41 points by mjangle1985 4 years ago · 16 comments


Inspired by... but admittedly off-topic... I had recently been wondering how people keep a full auditable history of their data. I've used Hibernate Envers in the past or the now seemingly-defunct Temporal Tables extension for PostgreSQL. What are people using these days? Is DoltLab it, or are there more common solutions?

cgio 4 years ago

Can’t say what people are using but keen to hear from others. I am looking into this atm, no implementation yet, and some of the things I am reading about are (I’ll add dolt to this list)
XTDB: https://xtdb.com/
TerminusDB: https://github.com/terminusdb/terminusdb
Nessie: https://projectnessie.org/
DVC: https://dvc.org
Liquibase: https://liquibase.org/
Datasette: https://datasette.io/
Still working out in my mind how schema evolution, x-temporal modelling, SCDs (slowly changing dimensions) in data modelling, version control, etc. tie together into one approach.
- sverhagen 4 years ago
  
  Not what I was looking for, but a nice list. One thing to call out, which the article linked in a sibling comment also covers: if you're going to mention Liquibase, why not mention Flyway as well? https://flywaydb.org/
  Good luck with your search.
  - cgio 4 years ago
    
    No specific reason other than not having come across it. Will add it to my list.
richardbarosky 4 years ago

They wrote a blog post about database versioning solutions a little while back, which distinguishes schema versioning tools from data versioning tools. I can't say for sure how comprehensive it is, but presumably it touches on the better-known solutions.
https://www.dolthub.com/blog/2021-09-17-database-version-con...
- parentheses 4 years ago
  
  I'm surprised you didn't include things like `dat`[0]
  [0] https://github.com/dat-ecosystem-archive/dat

vorpalhex 4 years ago

This is actually a really cool idea, and while I would have avoided it due to its SaaS nature, now I'm actually pretty willing to try it.

caffeine 4 years ago

I can’t figure out if this is a real product or a joke site? The name is confusing.

If it’s a real product it’s cool, I’ve wanted something like this for a while (currently I just use git repos full of JSON files but this would be better I think).

zachmu 4 years ago

Yup, it's a real product :)
If you want to experiment quickly and aren't squeamish about putting your data on the internet, DoltHub is easier to get started with. DoltLab is just a (limited) self-hosted version of DoltHub.

richardbarosky 4 years ago

Nice. While I haven't used Dolt, I've definitely enjoyed reading the blog, especially the MySQL compatibility stuff and some of the other fun ones (like the alcohol dispenser project). Good luck guys!

smoyer 4 years ago

Maybe it's a regional slur here in the U.S. but I'm wondering where the name originated.

reltuk 4 years ago

It's a play on git, which is itself regional slang.

parentheses 4 years ago

So many questions about the git-like semantics.

- how are shas created?

- assuming you hash the entire diff, can columns be ignored? e.g. timestamps or other "unimportant" data

- do any two insertions into a table "conflict"?

reltuk 4 years ago

To quickly answer these...
- it's content-addressed / Merkle DAG all the way down. The commit's hash is something like meta (author, description, timestamp) + parents hash + root value hash. The root value is composed of the schema and pointers to table and index maps. Tables and indexes are Merkle DAGs of the table data organized in a structure a bit like a B-tree, but with cut points chosen by a rolling value hash in order to probabilistically re-synchronize on incremental changes. Some details: https://www.dolthub.com/blog/2020-04-01-how-dolt-stores-tabl... , https://www.dolthub.com/blog/2020-05-13-dolt-commit-graph-an...
- currently table data is stored row-major, as full rows of the table, so diffing cannot efficiently ignore individual columns.
- direct conflicts are computed on a row-by-row basis, using the primary key of the row. So two insertions only conflict directly if they touch the same primary key. On top of that, constraints and foreign key references are maintained and validated across merges and edits.
HTH, happy to answer any further questions :).
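
For anyone who wants to poke at the first and third points, here is a minimal Python sketch, not Dolt's actual code. It assumes SHA-256 over canonical JSON as the hash (Dolt's real encoding and hash function differ), and the names `content_hash`, `commit_hash`, and `row_conflicts` are hypothetical helpers made up for illustration. It shows a commit address derived from meta + parent hashes + root value hash, and a three-way check that flags a conflict only when both sides changed the row with the same primary key in different ways.

```python
import hashlib
import json


def content_hash(obj) -> str:
    """Address a JSON-serializable value by the hash of its canonical encoding.

    Assumption: SHA-256 over sorted-key JSON, chosen only for illustration.
    """
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def commit_hash(meta: dict, parents: list, root_value_hash: str) -> str:
    """Commit address = hash of (metadata, parent commit hashes, root value hash)."""
    return content_hash({"meta": meta, "parents": parents, "root": root_value_hash})


def row_conflicts(base: dict, ours: dict, theirs: dict) -> list:
    """Return primary keys whose rows were changed differently on both sides.

    Each argument maps primary key -> row (a dict). A missing key means the
    row does not exist in that version of the table.
    """
    conflicts = []
    for pk in sorted(set(ours) | set(theirs)):
        b, o, t = base.get(pk), ours.get(pk), theirs.get(pk)
        # Conflict only if both sides diverged from base and disagree with each other.
        if o != b and t != b and o != t:
            conflicts.append(pk)
    return conflicts


if __name__ == "__main__":
    base = {1: {"name": "a"}}
    ours = {1: {"name": "a"}, 2: {"name": "b"}}
    theirs = {1: {"name": "a"}, 3: {"name": "c"}}
    print(row_conflicts(base, ours, theirs))  # inserts under different keys

    theirs_same_key = {1: {"name": "a"}, 2: {"name": "z"}}
    print(row_conflicts(base, ours, theirs_same_key))  # inserts under the same key

    meta = {"author": "someone", "description": "add row 2", "timestamp": 0}
    print(commit_hash(meta, ["parent-hash-placeholder"], content_hash(ours)))
```

Because the parent hashes feed into the commit hash, changing anything in history changes every descendant's address, which is the "all the way down" part. The sketch ignores the chunked B-tree-like storage and the constraint and foreign key validation mentioned above.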

hestefisk 4 years ago

Looks interesting. It’s like Git, ZFS, and MySQL all in one?
