Data Scientists Should Be Able to Deploy and Iterate Their Own Models

blog.algorithmia.com

25 points by mikeyanderson 7 years ago · 5 comments

hadsed 7 years ago

I fully agree, but the tooling is super immature. I think there's going to be an incredible opportunity for engineers to build tools for doing ML in a very efficient and scientifically rigorous way.

For instance, Jupyter is the closest thing we have to an IDE for science. It is an incredibly innovative project, but it is not what we need. We need a Photoshop, a Visual Studio, a Final Cut Pro for doing ML.

There are a lot of interesting projects out there solving some of these problems. My favorites are Prodigy (by Explosion AI), Pachyderm, and Paperspace, to name a few. But I think it'll be a decade until we get to a serious place with it as an industry.

I myself have found that understanding models after training is incredibly difficult. I'm talking about analyzing misclassifications, visualizing embeddings, and looking at saliency maps. We just don't know enough about how models work, and when we do, it's only after great effort that most small shops don't have the resources for. This was true when I was trying to get my last company off the ground and is still true now that I'm running ML at my current company. There is a pretty big opportunity here, especially given that most cloud ML companies seem to focus on just training and deployment. I'm actually thinking about trying to start this myself, given how much of a pain point it is for me today.
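
To make the first of those concrete, here is a rough sketch of what "analyzing misclassifications" can look like at its simplest: count which (true label, predicted label) confusions happen most often and pull out the indices of the offending examples for inspection. The function names and the toy labels are hypothetical, made up for illustration; it uses only the Python standard library and is not tied to any of the tools mentioned in this thread.

```python
from collections import Counter


def confusion_pairs(y_true, y_pred):
    """Count (true_label, predicted_label) pairs for misclassified examples only."""
    return Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)


def misclassified_indices(y_true, y_pred):
    """Return the positions of examples whose prediction differs from the label."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p]


# Hypothetical toy labels, invented for this sketch.
y_true = ["cat", "dog", "cat", "bird", "dog", "cat"]
y_pred = ["cat", "cat", "dog", "bird", "cat", "cat"]

pairs = confusion_pairs(y_true, y_pred)
wrong = misclassified_indices(y_true, y_pred)

# List the confusions from most to least frequent, then the examples to look at.
for (true_label, predicted_label), count in pairs.most_common():
    print(f"true={true_label} predicted={predicted_label}: {count}")
print("indices to inspect:", wrong)
```

Even this much tends to show where a model is systematically wrong; the harder parts mentioned above (embeddings, saliency maps) need model-specific tooling that a sketch like this does not cover.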

  • yboris 7 years ago

    For model versioning / keeping track of ML experiments, I can recommend Comet ML (https://www.comet.ml/). It takes just about 1 line of code to set up (and of course you can customize it a lot after that).

  • kfk 7 years ago

    Why not build a Photoshop or Visual Studio, etc., on top of Jupyter? That seems to be the right direction at this point.

    • hadsed 7 years ago

      I think it could be done. My thought is that, to be really compelling, it needs to be well integrated with cloud tools where you can organize and run your data processing jobs. With Jupyter, the transition from crummy dev work to something serious isn't really easy at this point.
