DAWN: Tools for AI and Data Product Development

dawn.cs.stanford.edu

215 points by indescions_2017 8 years ago · 14 comments

pella 8 years ago

Summary (slides): http://dawn.cs.stanford.edu/assets/dawn-overview.pdf

  • hacker_9 8 years ago

    The link to Snorkel [1] is really interesting: labeling data in a low-quality, programmatic way and then feeding it through a neural network to produce high-quality labels is really smart.

    [1] https://github.com/HazyResearch/snorkel

    • myth_drannon 8 years ago

      Yes, Snorkel and DeepDive look extremely useful. At my job we have a lot of data, but it's unlabeled, and it would cost millions of dollars to outsource the labeling/data entry to India.
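
For readers new to the programmatic labeling idea discussed in this subthread, here is a minimal sketch in plain Python. It does not use Snorkel's API: the labeling functions, the `weak_label` helper, and the spam/ham task are all hypothetical, and a simple majority vote stands in for the learned model that would combine the noisy votes in a real system.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion

# Hypothetical, deliberately crude labeling functions for a spam/ham task.
def lf_contains_free(text):
    return "spam" if "free" in text.lower() else ABSTAIN

def lf_has_greeting(text):
    return "ham" if text.lower().startswith(("hi", "hello")) else ABSTAIN

def lf_many_exclamations(text):
    return "spam" if text.count("!") >= 3 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_free, lf_has_greeting, lf_many_exclamations]

def weak_label(text, lfs=LABELING_FUNCTIONS):
    """Combine noisy votes by majority; abstain on ties or when nothing fires."""
    votes = [v for v in (lf(text) for lf in lfs) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting evidence, leave unlabeled
    return ranked[0][0]

if __name__ == "__main__":
    for message in ["FREE money!!!", "Hello, lunch tomorrow?", "Hi, free tickets"]:
        print(message, "->", weak_label(message))
```

Each function is cheap to write and individually unreliable; the point is that many of them together can label a large unlabeled corpus, and the resulting labels can then train a conventional model.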

cl42 8 years ago

This is great. In some ways it reminds me of the recent "Software 2.0" posts around here -- make the code and architecting so easy that we begin teaching machines by creating data rather than writing code.

tuomosipola 8 years ago

I like many of these ideas. They address real practical problems in the area, and new research is always welcome. How this will all be packaged into a working environment is not clear to me, but even the individual parts should be useful.

thisisit 8 years ago

So, is anyone using these tools in a live/production environment?

  • mateiz 8 years ago

    Matei Zaharia (one of the PIs on DAWN) here. Snorkel, MacroBase and ASAP are already being used in production at several companies, and we intend to continue publishing everything as open source. We only started this lab a year ago, so a lot of the projects listed are still new.

  • tuomosipola 8 years ago

    It would be interesting to hear about any experiences. These researchers have a background in Spark etc., so setting up might not be that difficult.

tabtab 8 years ago

Seems it has similar goals to the idea behind factor tables: https://github.com/RowColz/AI

  • Houshalter 8 years ago

    I'm pretty sure that guy just reinvented nearest neighbor.

    • tabtab 8 years ago

      Finding the "nearest matching pattern" is part of just about ANY pattern matching. The devil is in the details: dealing with noise, precision-loss-for-speed, generalization (compression), tuning, etc. This attempts to break those down into staff-digestible chunks.
