Using the Dropbox Datastore API in Python
This could be a fantastic opportunity for startups to scale more quickly without most of the per-user costs. If the user brings his own storage, I don't have to pay Google/Amazon/etc. to put the user's data in my app's datastore. Since Dropbox has utility to the user beyond just my app, I don't have to convince the user to start a subscription with me to cover his share of my costs; he already has a Dropbox account. As long as Dropbox comes with 2GB free storage, he doesn't even have to be a paying customer.
I love the sound of this but remain cautiously optimistic. One thing that immediately comes to mind: How do you prevent the user from breaking their own user experience by deleting folder X or moving file Z?
My thought would be to create a locked folder accessible only by APPX; then, if the user decides they don't want to use the service anymore and revokes APPX's permission, the folder gets deleted.
Or is the datastore invisible to the front end UI and Dropbox application?
Datastores are only visible on dropbox.com (but somewhat buried there) and via the API, so this isn't much of a concern.
Some see it as their responsibility to give the user a good user experience, even over the wishes of the user himself. Now, we can have a sensible debate over the need to warn a user about unintended consequences, but should we really be building features that act solely to restrict a user?
I've been playing with the datastore, and the data is not visible in the Dropbox folder. I'm sure it's buried somewhere under a safe blanket. So far, only its data sync limitations are a small annoyance, but that can be resolved.
I wonder if Guido is behind this. He wrote the Datastore API for AppEngine, and transferred to Dropbox last year.
Yes, Guido wrote the datastores part of the Python SDK and also played a big role in the design of the Datastore API itself.
Were the Dropbox front end and APIs built with Python in the first place?
I'm not sure I understand the question. Dropbox uses Python heavily both server-side and client-side. Does that help?
Sure, I was talking about the website that users and API clients "see", as opposed to the "back end", i.e., all the infrastructure used to manage files, etc.
Any benchmarks? Number of inserts per second? Number of queries per second?
What are the limits on the number of records and the number of fields (columns?) in a table? How many tables can one have?
How often does it commit? Does it commit to the local filesystem or to the cloud (and under what conditions)?
How does datastore sync across different devices work?
I don't have benchmarks to share, but the documentation goes over size limits: https://www.dropbox.com/developers/datastore/docs/python
As to the sync model, changes are always made locally and then queued for upload. You might want to read Guido's posts about conflict resolution to understand the model:
https://www.dropbox.com/developers/blog/48/how-the-datastore...
https://www.dropbox.com/developers/blog/56/how-the-dropbox-d...
Thx, smarx
Good info!
The 10MB database limit is too small. I would have imagined Dropbox wants apps to create hundreds of MB, if not GB, of data. :-)
Is there any test code available? Open source / on GitHub?
It is much easier to understand the flow from test code than from API docs. If test code is available, it is easy to morph it into benchmark-type code.
The Python library is open source, but no, I don't think we have any open source test code.
Keep in mind that file data should go outside of datastores (in files), so the 10MB is for the per-user, per-app structured data. E.g. contacts, app settings, game state, etc. 10MB actually goes a pretty long way there.
Also keep in mind that, at present, all of the SDKs load the entire datastore into memory, so there's a pretty low limit to how big you want a datastore to get. 10MB is a comfortable limit for now.
Thx again for the info.
I see why the 10MB limit.
What are the pros and cons of this compared to just using sqlite on a Dropbox volume?
From what I read, the API doesn't do any operational transformation, such as merging or computing deltas of the changes. The client app is expected to do it.
The main pro is the automatic merging. "The API" is a fuzzy term here. There's an interaction between a client and a server, and the client is running an SDK. In the case of the Datastore API, the server doesn't perform any merging or OT, but the client SDK does. Your code just gets conflict resolution for free. The Python SDK is a bit of a special case in that it doesn't implement this logic, but the others (JavaScript, iOS, Android) do.
In contrast, if you make a change to a sqlite database on two different devices, you now have two different files and no way to merge them. (There are people who have used Dropbox to sync sqlite databases this way, and they've ended up writing diff/patch over sqlite to merge changes. Using datastores is a lot simpler and more likely to be correct.)
Hmmmmm, my understanding of OT is that you have to implement it on the server.
Only the server knows the most current state of the dataset and can try to sync with multiple clients at the same time.
This is incorrect. OT can indeed be implemented on the client.
The way it works in the Datastore API is that the client sends its changes to the server with an attached "parent revision." If that parent revision is the revision that the server has, then the change goes through (no conflict). If the revision doesn't match, then the change is rejected by the server, and it's up to the client to pull down the latest changes from the server, merge things (via, in the case of lists, OT), and then try again.
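To make the flow above concrete, here is a toy sketch in plain Python. It is not the Dropbox SDK: `Server`, `Client`, `put_delta`, and `snapshot` are hypothetical names, everything lives in memory, and the merge step is a simple "local change wins per key" rule standing in for whatever the real SDKs do (including OT for lists).

```python
class Server:
    """Toy server: accepts a change only if it is based on the current revision."""

    def __init__(self):
        self.rev = 0
        self.records = {}

    def put_delta(self, parent_rev, changes):
        if parent_rev != self.rev:
            return False  # conflict: the client's parent revision is stale
        self.records.update(changes)
        self.rev += 1
        return True

    def snapshot(self):
        return self.rev, dict(self.records)


class Client:
    """Toy client: edits locally, queues changes, merges on rejection."""

    def __init__(self, server):
        self.server = server
        self.rev, self.records = server.snapshot()
        self.pending = {}  # local changes not yet accepted by the server

    def set(self, key, value):
        self.records[key] = value
        self.pending[key] = value

    def sync(self, max_attempts=5):
        if not self.pending:
            self.rev, self.records = self.server.snapshot()
            return True
        for _ in range(max_attempts):
            if self.server.put_delta(self.rev, self.pending):
                self.rev += 1
                self.pending = {}
                return True
            # Rejected: pull the latest state, re-apply local changes on top
            # (assumed merge rule: local wins per key), then try again.
            self.rev, latest = self.server.snapshot()
            latest.update(self.pending)
            self.records = latest
        return False


server = Server()
a, b = Client(server), Client(server)
a.set("name", "alice")
b.set("color", "blue")
a.sync()  # accepted: parent revision 0 matches the server
b.sync()  # first attempt rejected, then merged and retried
```

The point of the sketch is only that the server never merges anything; it just compares revisions, and the retry loop lives entirely in the client.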
I wonder whether it's intentional that the tutorial isn't "copy-pastable".
Specifically, it does not spell out any of the imports, and uses very large except clauses.
Hi, I wrote the tutorial. :-)
In general, I based the tutorial on the sample app that ships with the SDK. That app uses Flask and presents some actual UI. If you want full working code, I would suggest taking a look at that sample.
The tutorial is basically fragments from that sample that show the basic concepts, but without some boilerplate, the fragments are not themselves runnable.
(BTW, where are the "very large except clauses?")
Ha! Nice to see you here : )
The except clause is in the "dropbox_auth_finish" view. That might be personal paranoia, but I'm sure that every single time I think "It's OK to put an `except:` here", it eventually comes back to bite me (without exception ;) )!
I do understand the motivation to keep it minimal (and I think the Flask boilerplate is indeed well understood), but putting together the sample without the Dropbox SDK imports (specifically without an IDE) might end up being a bit more bothersome than optimal : )
Just personal opinion, of course!
Oh, I misunderstood "very large." :-) The actual code sample has a more detailed list of exceptions that it handles, but the code was a bit long.
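For anyone wondering why a bare `except:` tends to bite, here is a small generic sketch. It is not the tutorial's code and does not use the SDK's exception classes; `handle_bare`, `handle_narrow`, and the choice of `KeyError`/`ValueError` are purely illustrative.

```python
def handle_bare(step):
    try:
        return step()
    except:  # catches everything, including KeyboardInterrupt and SystemExit
        return "error"


def handle_narrow(step):
    try:
        return step()
    except (KeyError, ValueError) as exc:  # only the failures we expect
        return "bad request: " + type(exc).__name__


def interrupted():
    raise KeyboardInterrupt  # e.g. the user pressing Ctrl-C


def missing_param():
    raise KeyError("code")


bare_result = handle_bare(interrupted)        # the interrupt is silently swallowed
narrow_result = handle_narrow(missing_param)  # the expected failure is handled

try:
    handle_narrow(interrupted)
    propagated = False
except KeyboardInterrupt:
    propagated = True  # the narrow handler lets the interrupt through
```

The same applies to plain bugs: a typo that raises `NameError` inside the `try` body would be reported as "error" by the bare version and never surface.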
If I knew Python more I would write a Sublime Text plugin that syncs my settings over Dropbox.
Not a Sublime user, but you can do this with many (most?) apps with symlinks from Dropbox folders into the "~/Library/Application Support" folders of your various OS X devices.
Some apps won't enjoy being open simultaneously on several machines, but if you keep this in mind, it is generally a great solution.
That way Dropbox, not a homegrown process, syncs your settings.
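If you would rather script the symlink approach than do it by hand, a minimal sketch could look like the following. It assumes the real settings folder lives inside Dropbox and the app's usual location becomes a symlink pointing at it; `link_settings` and the example paths are hypothetical, and where a given app keeps its settings is something you would need to check yourself.

```python
import os
import shutil
from pathlib import Path


def link_settings(app_dir, synced_dir):
    """Keep the real settings in synced_dir and leave a symlink at app_dir."""
    app_dir, synced_dir = Path(app_dir), Path(synced_dir)
    if app_dir.is_symlink():
        return  # already linked, nothing to do
    if synced_dir.exists():
        # A synced copy already exists (e.g. on a second machine):
        # set the local settings aside instead of overwriting anything.
        if app_dir.exists():
            app_dir.rename(app_dir.with_name(app_dir.name + ".bak"))
    else:
        synced_dir.parent.mkdir(parents=True, exist_ok=True)
        if app_dir.exists():
            shutil.move(str(app_dir), str(synced_dir))
        else:
            synced_dir.mkdir()
    app_dir.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(str(synced_dir), str(app_dir), target_is_directory=True)
```

A call might look like `link_settings(Path.home() / "Library/Application Support/SomeApp", Path.home() / "Dropbox/AppSettings/SomeApp")`, run once per machine with the app closed.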