DoltLab v0.2.0
dolthub.com
Inspired by... but admittedly off-topic... I had recently been wondering how people keep a full auditable history of their data. I've used Hibernate Envers in the past or the now seemingly-defunct Temporal Tables extension for PostgreSQL. What are people using these days? Is DoltLab it, or are there more common solutions?
Can’t say what people are using but keen to hear from others. I am looking into this atm, no implementation yet, and some of the things I am reading about are (I’ll add dolt to this list)
XTDB: https://xtdb.com/
TerminusDB: https://github.com/terminusdb/terminusdb
Nessie: https://projectnessie.org/
DVC: https://dvc.org
Liquibase: https://liquibase.org/
Datasette: https://datasette.io/
Still working out in my mind how schema evolution, x-temporal modelling, SCDs (slowly changing dimensions) in data modelling, version control, etc. tie together into one approach.
Not what I was looking for, but a nice list. Just want to call out something I also noticed in the article that a sibling comment pointed to: if you're gonna mention Liquibase, why not mention Flyway too: https://flywaydb.org/
Good luck with your search.
No specific reason other than not having come across it. Will add it to my list.
They wrote a blog post about database versioning solutions a little while back, which distinguishes schema vs. data versioning tools. I can't say for sure how comprehensive it is, but presumably it touches on the better-known solutions.
https://www.dolthub.com/blog/2021-09-17-database-version-con...
I'm surprised you didn't include things like `dat`[0]
This is actually a really cool idea, and while I would have avoided it due to its SaaS nature, now I'm actually pretty willing to try it.
I can’t figure out if this is a real product or a joke site? The name is confusing.
If it’s a real product it’s cool, I’ve wanted something like this for a while (currently I just use git repos full of JSON files but this would be better I think).
Yup, it's a real product :)
If you want to experiment quickly and aren't squeamish about putting your data on the internet, DoltHub is easier to get started with. DoltLab is just a (limited) self-hosted version of DoltHub.
Nice. While I haven't used Dolt, I've definitely enjoyed reading the blog, especially the MySQL compatibility stuff and some of the other fun ones (like the alcohol dispenser project). Good luck guys!
Maybe it's a regional slur here in the U.S. but I'm wondering where the name originated.
It's a play on git, which is itself regional slang.
So many questions about the git-like semantics.
- how are shas created?
- assuming you hash the entire diff, can columns be ignored? e.g. timestamps or other "unimportant" data
- do any two insertions into a table "conflict"?
To quickly answer these...
- it's content-addressed Merkle DAGs all the way down. The commit's hash is something like meta (author, description, timestamp) + parent hashes + root value hash. The root value is composed of the schema, and pointers to table and index maps. Tables and indexes are Merkle DAGs of the table data organized in a structure a bit like a B-tree, but with cut points chosen by a rolling value hash in order to probabilistically re-synchronize on incremental changes. Some details: https://www.dolthub.com/blog/2020-04-01-how-dolt-stores-tabl... , https://www.dolthub.com/blog/2020-05-13-dolt-commit-graph-an...
- currently table data is stored row-major for full rows of the table, so diffing cannot efficiently ignore individual columns.
- direct conflicts are computed on a row-by-row basis, using the primary key of the row. And then constraints and foreign key references are maintained and validated across merges and edits.
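To make the first and third answers concrete, here's a rough Python sketch. It is not Dolt's actual code or on-disk format: SHA-256, the JSON encoding, and the names `content_hash`, `commit_hash` and `merge_rows` are all assumptions made up for illustration. It shows a commit address derived from meta + parent hashes + root value hash, and a three-way merge that detects conflicts per primary key. Constraint and foreign key validation are left out.

```python
import hashlib
import json


def content_hash(obj):
    """Address a JSON-serializable value by the hash of its canonical encoding.

    Assumption: SHA-256 over sorted-key JSON stands in for the real format.
    """
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(data).hexdigest()


def commit_hash(meta, parent_hashes, root_value_hash):
    """Hypothetical commit address: meta + parent hashes + root value hash."""
    return content_hash(
        {"meta": meta, "parents": parent_hashes, "root": root_value_hash}
    )


def merge_rows(base, ours, theirs):
    """Three-way merge of tables given as {primary_key: row} dicts.

    A missing key means the row does not exist on that side.
    Returns (merged_table, conflicting_primary_keys).
    """
    merged, conflicts = {}, []
    for pk in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(pk), ours.get(pk), theirs.get(pk)
        if o == t:
            result = o  # both sides agree (same edit, or both deleted)
        elif o == b:
            result = t  # only their side touched this row
        elif t == b:
            result = o  # only our side touched this row
        else:
            conflicts.append(pk)  # both sides changed the row differently
            result = o  # keep ours provisionally until the conflict is resolved
        if result is not None:
            merged[pk] = result
    return merged, conflicts
```

Under this model two inserts only conflict when they use the same primary key with different row contents; inserts with different keys merge cleanly.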
HTH, happy to answer any further questions :).
Looks interesting. It’s like Git, ZFS, and MySQL all in one?