Cognitect: Relevance merges with Metadata Partners (Datomic)

cognitect.com

107 points by AndreasFrom 12 years ago · 26 comments

calibraxis 12 years ago

More context in their podcast: (http://cognitect.com/podcast)

mike_ivanov 12 years ago

Any plans to open-source Datomic?

MrBuddyCasino 12 years ago

What is the big deal about Datomic?
From their FAQ:
"Datomic is not a good fit if you need unlimited write scalability, or have data with a high update churn rate (e.g. counters)."
Don't you get most of that through... caching? Also, it seems to assume that the dataset will fit into RAM.
- grayrest 12 years ago
  
  Datomic is interesting because it's a different take on what a database should look like. The TLDR version by someone who's looked into it a bit but not actually used it:
  * Storage, Transactions, and Querying are separated, as in they run in different processes/machines.
  * Data is immutable. Storage is pluggable and has implementations on top of Dynamo/Riak.
  * Transaction semantics and ordering are controlled by a single process for consistency. This is the write scaling caveat. It's less of a restriction than it sounds (if you're thinking SQLite2 like I did) because there aren't writes/queries competing for resources, it's just the sequencing.
  * Queries on the db are performed in-client and can interoperate with client code and state. When you write a query, datomic pulls the data from storage to the local machine and performs the query.
  * Queries are in a logic programming language called datalog. Even if you aren't interested in the rest, I'll recommend spending an hour working through http://learndatalogtoday.org/ just for the exposure to logic programming.
  - zerr 12 years ago
    
    You mean the whole data is fetched to the client and only queried afterwards? Why did they choose this way?
    
    jasonwatkinspdx 12 years ago
    
    Only the range of data the client is interested in is fetched from the storage layer.
    As for why they chose this, you'd have to ask them to be sure.
    But two reasonable assumptions are: 1. they wanted the storage layer to be "dumb", in particular so that they could use existing services like Dynamo. 2. they wanted reading processes to be totally independent. Readers can talk directly to the dumb storage layer without any centralized resource coordinator to execute queries. That means horizontal scalability in the strict sense.
    
    _halgari 12 years ago
    
    Only partial indexes are retrieved (what is needed to answer your exact query). The bonus is that that data is now local. Traversing deep structures then often approaches the speed of hash-map lookups. As someone who has worked on very complex SQL databases, this is a major win.
    
    icefox 12 years ago
    
    Only the data you need is fetched (and cached) so the client only has a subset of the database.
    
    dustingetz 12 years ago
    
    Indexes and chunks of data that are used often remain cached in each application instance, and new changes are streamed to the application cache.
    It means that reads from a hot cache do not touch the network. Reads are very fast and scale "out". You can write code that does a lot of reads without caring much about performance. (SQL reads only scale "up" and you care very much about their performance.)
    Datomic is like Git (distributed reads, central writes); Postgres is like CVS/SVN (centralized reads and writes). This is made possible by immutable history.
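To make the query model described in this sub-thread a little more concrete, here is a minimal Python sketch of datalog-style pattern matching over entity/attribute/value facts that are already held in local memory. It is a toy under stated assumptions, not Datomic's API or query engine: the `query` and `match` helpers, the `?`-prefixed variable convention, and the sample facts are all invented for illustration.

```python
# Toy model only: NOT Datomic's API. Facts are (entity, attribute, value) tuples
# that we assume have already been fetched to the local process.
FACTS = [
    (1, "person/name", "Alice"),
    (1, "person/city", "Durham"),
    (2, "person/name", "Bob"),
    (2, "person/city", "Raleigh"),
    (3, "person/name", "Carol"),
    (3, "person/city", "Durham"),
]


def is_var(term):
    """Hypothetical convention: strings starting with '?' are logic variables."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, bindings):
    """Unify one pattern with one fact; return extended bindings or None."""
    new = dict(bindings)
    for term, value in zip(pattern, fact):
        if is_var(term):
            if term in new and new[term] != value:
                return None  # variable already bound to something else
            new[term] = value
        elif term != value:
            return None  # constant in the pattern does not match the fact
    return new


def query(find, where, facts):
    """Join every pattern in `where`, then project the `find` variables."""
    results = [{}]
    for pattern in where:
        results = [
            extended
            for bindings in results
            for fact in facts
            if (extended := match(pattern, fact, bindings)) is not None
        ]
    return sorted({tuple(b[v] for v in find) for b in results})


# "Find the names of all entities whose city is Durham."
names_in_durham = query(
    find=["?name"],
    where=[("?e", "person/city", "Durham"), ("?e", "person/name", "?name")],
    facts=FACTS,
)
print(names_in_durham)
```

The shared variable `?e` is what joins the two patterns. Because the facts are local, the join is plain in-process iteration, which is the property the comments above are pointing at.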
- dustingetz 12 years ago
  
  TLDR: Datomic is like Git.
  In traditional ACID databases (SQL), all queries (read and write) mostly only scale UP (beefier db machine), not OUT (lots of db machines is very hard). Datomic is an ACID database where writes still scale UP, but reads can scale OUT.
  A consequence of this separation of reads and writes is that Datomic reads scale practically arbitrarily for both query load and dataset size. Writes do not.
  This is a lot like Git, where you have to push to a central place which orders and rejects commits, but you can make useful reads from your local machine without touching the network. Datomic is a lot like Git + realtime secret sauce.
  That's only half the value though - Datomic also doesn't have an object-relational impedance mismatch. This means Datomic doesn't need ORMs; Datomic's programming model is simpler than SQL for a competitive set of features. So you code faster with fewer bugs.
- drcode 12 years ago
  
  In short, you can ask a datomic database stuff like "Show me all things that are different for customer X from the database today versus the database one year ago on September 14th at 9:32 AM" and it can answer those types of queries with high performance.
  And no, the dataset does not need to fit in RAM.
  - calibraxis 12 years ago
    
    You can also go forward in time, to a hypothetical future. (That is, you add data and get back a new DB value, which you can query against. But the DB's source isn't modified.) Can be useful in analytics which deal with what-if scenarios.
  - MrBuddyCasino 12 years ago
    
    Thanks, that helped. They should be clearer about that on their website. I know Clojure a bit and some of the things about state, time and identity, and I still didn't get it.
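The "database as a value" idea in this sub-thread (querying a past state, and trying out hypothetical additions without touching the real database) can be sketched in a few lines of Python. This is a toy under stated assumptions, not Datomic's API: the `Db` class, its `with_facts`, `as_of`, and `entity` methods, the `diff` helper, and the sample data are all hypothetical, and the model ignores retractions entirely.

```python
# Toy model only: NOT Datomic's API. Every name here is hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Db:
    """An immutable database value: an append-only tuple of (t, entity, attr, value)."""

    log: tuple = ()

    def with_facts(self, t, facts):
        """Return a NEW Db that also contains `facts`; this Db is left unchanged."""
        return Db(self.log + tuple((t, e, a, v) for e, a, v in facts))

    def as_of(self, t):
        """Return the Db as it looked at time t (later facts are dropped)."""
        return Db(tuple(row for row in self.log if row[0] <= t))

    def entity(self, e):
        """View of one entity: for each attribute, the most recent value wins."""
        view = {}
        for _, ent, attr, value in sorted(self.log, key=lambda row: row[0]):
            if ent == e:
                view[attr] = value
        return view


def diff(old, new, e):
    """Attributes of entity `e` whose value differs between two Db values."""
    before, after = old.entity(e), new.entity(e)
    return {
        attr: (before.get(attr), after.get(attr))
        for attr in sorted(set(before) | set(after))
        if before.get(attr) != after.get(attr)
    }


# Build up some history; `t` is just a year here to keep the example readable.
db = Db().with_facts(2012, [("cust-x", "plan", "free"), ("cust-x", "city", "Durham")])
db = db.with_facts(2013, [("cust-x", "plan", "pro")])

# "What is different for customer X now versus a year ago?"
changes = diff(db.as_of(2012), db, "cust-x")

# "What if": derive a speculative Db value; `db` itself is not modified.
what_if = db.with_facts(2014, [("cust-x", "plan", "enterprise")])

print(changes)
print(what_if.entity("cust-x")["plan"], db.entity("cust-x")["plan"])
```

Because every operation returns a new value and never mutates the log, the past view, the current view, and the speculative view can all be queried side by side.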
- bjeanes 12 years ago
  
  It does not at all assume or require that your dataset will fit in RAM. To an extent, it will cache some indexes in RAM of query peers, but there is no expectation that the whole dataset is in RAM.
calibraxis 12 years ago

Nope: https://twitter.com/cognitect/status/379605967211888640
1qaz2wsx3edc 12 years ago

Follow up: What are some open-source alternatives or similar software?
- jasonwatkinspdx 12 years ago
  
  I'm doing some preliminary work on one. But realistically, it's a lofty goal and it'll be hard to get going. My priorities are different, so I'm taking some different design paths than Datomic as well (i.e., no datalog).
- robertfw 12 years ago
  
  I have done some searching and have not turned up anything. The functionality that datomic enables is really intriguing, but I have trouble bringing myself around to using a non-open source bit of tooling.
- chrismonsanto 12 years ago
  
  I'd love to be proved wrong, but I don't think there are any (at least that aren't just research prototypes)

bfe 12 years ago

((defn cognitect ([] (conj [relevance] datomic)))) ; => all ur cljr r belong to us

praptak 12 years ago

I'm missing a piece of info here; could someone please fill me in? Rich Hickey is known for Clojure, Metadata Partners for Datomic. What are the Relevance guys known for?

(Honest question, not a cheap attempt at dismissal :-) )

rsanders 12 years ago

Quite a few members of the Clojure core team and community work there, as evidenced by the intersection of http://thinkrelevance.com/team and http://clojure.com/about.html.
- _halgari 12 years ago
  
  In addition, Relevance has had a close relationship with Rich for years; as such, we've had a major hand in the development of Clojure, ClojureScript, Datomic, core.async, Pedestal, Simulant, and many other Clojure projects.
bfe 12 years ago

Hosting a high density of clojure/core, and all that implies.

jawns 12 years ago

I attended a tech conference earlier this year where Rich Hickey was speaking. He was trumpeting the fact that data never really gets deleted in Datomic, and someone brought up the question: What happens if you are legally required to delete something from your database? I seem to remember him saying that Datomic wasn't really designed for that scenario, which sounds like a major problem.

andrewvc 12 years ago

That actually isn't the case, you can delete data in datomic. http://docs.datomic.com/excision.html
