Using the Dropbox Datastore API in Python

39 points by sean_lynch 13 years ago · 25 comments

This could be a fantastic opportunity for startups to scale more quickly without most of the per-user costs. If the user brings his own storage, I don't have to pay Google/Amazon/etc. to put the user's data in my app's datastore. Since Dropbox has utility to the user beyond just my app, I don't have to convince the user to start a subscription with me to cover his share of my costs; he already has a Dropbox account. As long as Dropbox comes with 2GB free storage, he doesn't even have to be a paying customer.

instaheat 13 years ago

I love the sound of this but remain cautiously optimistic. One thing that immediately comes to mind: How do you prevent the user from breaking their own user experience by deleting folder X or moving file Z?

My thought would be to create a locked folder accessible only by APPX and then if the user decides they don't want to use the service any more and revokes permission to APPX the folder gets deleted.

instaheat 13 years ago

Or is the datastore invisible to the front end UI and Dropbox application?
- smarx 13 years ago
  
  Datastores are only visible on dropbox.com (but somewhat buried there) and via the API, so this isn't much of a concern.
pavpanchekha 13 years ago

Some see it as their responsibility to give the user a good user experience, even over the wishes of the user himself. Now, we can have a sensible debate over the need to warn a user about unintended consequences, but should we really be building features that act solely to restrict a user?
semerda 13 years ago

Been playing with the datastore and the data is not visible in the Dropbox folder. I'm sure it's somewhere there buried under a safe blanket. So far only its data sync limitations are a small annoyance, but that can be resolved.

bsimpson 13 years ago

I wonder if Guido is behind this. He wrote the Datastore API for AppEngine, and transferred to Dropbox last year.

smarx 13 years ago

Yes, Guido wrote the datastores part of the Python SDK and also played a big role in the design of the Datastore API itself.
- camus2 13 years ago
  
  Are the Dropbox front end and APIs built with Python in the first place?
  - smarx 13 years ago
    
    I'm not sure I understand the question. Dropbox uses Python heavily both server-side and client-side. Does that help?
    
    camus2 13 years ago
    
    Sure, I was talking about the website that users and API clients "see", as opposed to the "back end", i.e., all the infrastructure used to manage files, etc.

tonyplee 13 years ago

Any benchmarks? Number of inserts per second? Number of queries per second?

What are the limits on the number of records and the number of fields (columns?) in a table? How many tables can one have?

How often is the commit? Is it committed to the local filesystem or to the cloud (and under what conditions)?

How does datastore sync between different devices work?

smarx 13 years ago

I don't have benchmarks to share, but the documentation goes over size limits: https://www.dropbox.com/developers/datastore/docs/python
As to the sync model, changes are always made locally and then queued for upload. You might want to read Guido's posts about conflict resolution to understand the model:
https://www.dropbox.com/developers/blog/48/how-the-datastore...
https://www.dropbox.com/developers/blog/56/how-the-dropbox-d...
- tonyplee 13 years ago
  
  Thx, smarx
  Good info!
  A 10MB database limit is too small. Somehow I would imagine Dropbox wants apps to create hundreds of MB, if not GB, of data. :-)
  Is there any test code available? Open source / on GitHub?
  It is much easier to understand the flow from test code than from API docs. If test code is available, it is easy to morph it into benchmark-type code.
  - smarx 13 years ago
    
    The Python library is open source, but no, I don't think we have any open source test code.
    Keep in mind that file data should go outside of datastores (in files), so the 10MB is for the per-user, per-app structured data. E.g. contacts, app settings, game state, etc. 10MB actually goes a pretty long way there.
    Also keep in mind that, at present, all of the SDKs load the entire datastore into memory, so there's a pretty low limit to how big you want a datastore to get. 10MB is a comfortable limit for now.
    
    tonyplee 13 years ago
    
    Thx again for the info.
    I see why the 10MB limit.
    What are the pros and cons of this compared to just using sqlite on a Dropbox volume?
    From what I read, the API doesn't do any operational transformation, such as merging or computing deltas of the changes. The client app is expected to do it.
    
    smarx 13 years ago
    
    The main pro is the automatic merging. "The API" is a fuzzy term here. There's an interaction between a client and a server, and the client is running an SDK. In the case of the Datastore API, the server doesn't perform any merging or OT, but the client SDK does. Your code just gets conflict resolution for free. The Python SDK is a bit of a special case in that it doesn't implement this logic, but the others (JavaScript, iOS, Android) do.
    In contrast, if you make a change to a sqlite database on two different devices, you now have two different files and no way to merge them. (There are people who have used Dropbox to sync sqlite databases this way, and they've ended up writing diff/patch over sqlite to merge changes. Using datastores is a lot simpler and more likely to be correct.)
    
    tonyplee 13 years ago
    
    Hmmmmm, my understanding of OT is that you have to implement it on the server...
    Only the server knows the most current state of the dataset and can try to sync up with multiple clients at the same time.
    
    smarx 13 years ago
    
    This is incorrect. OT can indeed be implemented on the client.
    The way it works in the Datastore API is that the client sends its changes to the server with an attached "parent revision." If that parent revision is the revision that the server has, then the change goes through (no conflict). If the revision doesn't match, then the change is rejected by the server, and it's up to the client to pull down the latest changes from the server, merge things (via, in the case of lists, OT), and then try again.
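
A minimal sketch of the parent-revision handshake described above, in plain Python. Everything in it is hypothetical and is not the Dropbox SDK: `ToyServer`, `ToyClient`, and the "local pending change wins per key" merge rule are stand-ins chosen only to show the reject, pull, merge, retry loop. The thread says list merges use OT, which this sketch does not attempt.

```python
class ToyServer:
    """In-memory stand-in for the server. It never merges anything."""

    def __init__(self):
        self.rev = 0
        self.deltas = []  # deltas[i] takes the data from revision i to i + 1

    def put_delta(self, parent_rev, delta):
        # Accept a change only if it was built on the current revision.
        if parent_rev != self.rev:
            return False
        self.deltas.append(delta)
        self.rev += 1
        return True

    def get_deltas(self, since_rev):
        return self.deltas[since_rev:]


class ToyClient:
    """Changes are made locally, queued, and uploaded on sync()."""

    def __init__(self, server):
        self.server = server
        self.rev = 0
        self.data = {}
        self.pending = {}

    def set(self, key, value):
        self.data[key] = value
        self.pending[key] = value

    def pull(self):
        for delta in self.server.get_deltas(self.rev):
            self.data.update(delta)
            self.rev += 1
        # Assumed merge rule for this sketch: local pending changes win.
        self.data.update(self.pending)

    def sync(self, max_attempts=5):
        """Upload pending changes; return how many attempts were rejected."""
        rejections = 0
        for _ in range(max_attempts):
            if not self.pending:
                self.pull()
                return rejections
            if self.server.put_delta(self.rev, dict(self.pending)):
                self.rev += 1
                self.pending = {}
                return rejections
            rejections += 1
            self.pull()  # conflict: fetch, merge, then try again
        raise RuntimeError("gave up after repeated conflicts")


server = ToyServer()
a, b = ToyClient(server), ToyClient(server)
a.set("name", "Ann")
b.set("city", "Oslo")
a.sync()  # accepted: parent revision 0 matches the server
b.sync()  # rejected once (stale parent revision), then accepted after a pull
a.sync()  # nothing pending, so this only pulls b's change
```

After these calls both clients hold the same two keys, and the server never had to merge anything itself.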

krallin 13 years ago

I wonder whether it's intentional that the tutorial isn't "copy-pastable".

Specifically, it does not spell out any of the imports, and uses very large except clauses.

smarx 13 years ago

Hi, I wrote the tutorial. :-)
In general, I based the tutorial on the sample app that ships with the SDK. That app uses Flask and presents some actual UI. If you want full working code, I would suggest taking a look at that sample.
The tutorial is basically fragments from that sample that show the basic concepts, but without some boilerplate, the fragments are not themselves runnable.
(BTW, where are the "very large except clauses?")
- krallin 13 years ago
  
  Ha! Nice to see you here : )
  The except clause is in the "dropbox_auth_finish" view. That might be personal paranoia, but I'm sure that every single time I think "It's OK to put an `except:` here", it eventually comes back to bite me (without exception ;) )!
  I do understand the motivation to keep it minimal (and I think the Flask boilerplate is indeed well understood), but putting together the sample without the Dropbox SDK imports (specifically without an IDE) might end up being a bit more bothersome than optimal : )
  Just personal opinion, of course!
  - smarx 13 years ago
    
    Oh, I misunderstood "very large." :-) The actual code sample has a more detailed list of exceptions that it handles, but the code was a bit long.
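
To illustrate the point about bare `except:` clauses, here is a small sketch that catches only the failures it anticipates. The exception classes and `finish_auth` are hypothetical stand-ins, not the tutorial's code or the Dropbox SDK's names; the `finish` argument is any callable that completes an auth flow and returns a token string.

```python
# Hypothetical stand-ins for the auth errors a real SDK would raise.
class BadStateError(Exception):
    """The session that started the auth flow is gone."""


class NotApprovedError(Exception):
    """The user declined to authorize the app."""


def finish_auth(finish, query_args):
    """Run finish(query_args) and map each anticipated failure to (status, message)."""
    try:
        token = finish(query_args)
    except BadStateError:
        return 302, "session expired, restart the auth flow"
    except NotApprovedError:
        return 200, "authorization declined"
    # No bare `except:` here, so a bug such as a KeyError still surfaces.
    return 200, "linked with token " + token
```

The trade-off is the one discussed above: the handler list is longer, but an unexpected error shows up as a traceback instead of being silently swallowed.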

msoad 13 years ago

If I knew Python better, I would write a Sublime Text plugin that syncs my settings over Dropbox.

stblack 13 years ago

Not a Sublime user, but you can do this with many (most?) apps with symlinks from Dropbox folders into the "~/Library/Application Support" folders of your various OS X devices.
Some apps won't enjoy being open simultaneously, but if you keep this in mind, this is generally a great solution.
Therefore Dropbox, not a homegrown process, syncs your settings.
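
A rough Python sketch of the symlink approach, under one assumption about the direction: the real settings folder is moved into the Dropbox folder and a symlink is left at the app's original location. `move_and_link` and both path arguments are hypothetical names; back up the folder before trying anything like this on real settings.

```python
import os
import shutil


def move_and_link(settings_dir, dropbox_dir):
    """Move settings_dir into dropbox_dir and leave a symlink at the old path.

    Returns the new location of the real files. Calling it again on an
    already linked folder returns the existing link target and changes nothing.
    """
    settings_dir = os.path.normpath(settings_dir)
    if os.path.islink(settings_dir):
        return os.readlink(settings_dir)
    target = os.path.join(dropbox_dir, os.path.basename(settings_dir))
    if os.path.exists(target):
        # Refuse to overwrite a copy that another device already synced.
        raise FileExistsError(target)
    shutil.move(settings_dir, target)
    os.symlink(target, settings_dir, target_is_directory=True)
    return target
```

On a second machine the synced copy already exists in Dropbox, so there you would only create the link (after moving the local folder aside) rather than call this function as written.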
