Torus: A Toolkit for Docker-First Data Science

medium.com

74 points by augustflanagan 8 years ago · 14 comments

saamm 8 years ago

This is interesting! It sounds like this v1 gets your local environment up and running in a Docker container. I maintain something similar for analysts on my team, and we've seen success in terms of decreasing time spent on environment setup.

As another interesting use of Docker in the data space, I'm excited about Pachyderm [0] (though I haven't had the chance to use it in production). In particular, the data provenance story seems compelling.

0: https://github.com/pachyderm/pachyderm

  • jdoliner 8 years ago

    Thanks for the plug, saamm. I'm one of the creators of Pachyderm. I think Torus and Pachyderm would work very nicely together. With just a few commands, you could go straight from developing code in the image Torus provides to deploying it on Pachyderm as a production pipeline that runs on new data as it comes in. Similarly, their Dockerized data science cookie-cutter could work nicely as a Pachyderm service. It would work much like using the service on your laptop, except that you could easily deploy it on a cloud provider and schedule it with GPUs, and it would get updated with new data as it comes in.

    Very exciting to see more people applying containers to data science.

    • sdeymanifold 8 years ago

      Yes to containers! We are trying to make it as seamless as possible to be Docker-first in all things, without reinventing the devops wheel. It just needs to be adapted for the needs of data scientists. Pachyderm is really cool. I will have to check it out. We've recently moved to Airflow for all our pipeline management... how does Pachyderm fit in that ecosystem?

      • jdoliner 8 years ago

        Pachyderm's pipeline system covers much of the same functionality as Airflow's so there's generally not much reason to use both.

zimbatm 8 years ago

Not to be confused with this other company at https://manifold.io/ (io, not ai), which deprecated their https://www.torus.sh/ project :)

robohamburger 8 years ago

I think a more interesting direction would be for JupyterLab to ship an Electron app that understands how to spin up and talk to containerized kernels.

I made a hacky version for work that proxies to a k8s pod but first class support would be cool.

mr_toad 8 years ago

Is cookie cutter running inside the docker container or on the host? The instructions imply setting up python, virtualenv, pip and cookie cutter all on the local machine...

stmw 8 years ago

Isn't this what Pachyderm was supposed to do?
