Probabilistic Programming and Bayesian Methods for Hackers (2013)

github.com

188 points by wojciem 6 years ago · 17 comments

dang 6 years ago

A thread from 2016: https://news.ycombinator.com/item?id=12330462

2015: https://news.ycombinator.com/item?id=9182332

2014: https://news.ycombinator.com/item?id=7297195

2013: https://news.ycombinator.com/item?id=6351681

https://news.ycombinator.com/item?id=6102782

https://news.ycombinator.com/item?id=5817713

zengid 6 years ago

I'm quite familiar with deep learning because it's so heavily hyped, but am less familiar with probabilistic programming and Bayesian methods. So, I have a general question: is anyone using probabilistic programming in industry? Have people ditched it for DNNs? Are people taking hybrid approaches to try to mix the two?

  • ssivark 6 years ago

    Can’t answer for the whole industry. The two methods have very complementary strengths and weaknesses—so which one you apply will depend on the constraints of the domain (eg: large amounts of training data vs intelligent priors). If you’re lucky, and the situation enables it, you could build out both and ensemble them and hope to get the best of both worlds.

    I still think the software stack for probabilistic programming has a ways to go before it becomes as easy to use as a NN using PyTorch, but it should get there in the near future. I’m personally very very excited about the probabilistic programming approach — conceptually it’s a very smooth segue from structured numerical algorithms, and allows you to really exploit problem structure if you have good domain understanding.

    For me, it helps organize a lot of well-known algorithms as special cases of a general framework—which is worthwhile in itself. If I can code in the generic framework, and have the compiler generate the appropriate (optimized) special case algorithm (as one hopes), that’s icing on the cake.
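As a concrete instance of a well-known algorithm falling out as a special case, here is a minimal stdlib-only sketch (not from the thread; data and grid approximation invented for illustration): Bayesian linear regression with a flat prior and Gaussian likelihood recovers the ordinary least squares estimate.

```python
import math
import random

# Toy data: y ≈ 2x + Gaussian noise (made up for illustration)
random.seed(0)
xs = [i / 10 for i in range(1, 51)]
ys = [2.0 * x + random.gauss(0, 0.1) for x in xs]

# Specialized algorithm: ordinary least squares for y = b*x (no intercept)
ols_b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# General framework: the same model as a probabilistic program -- a Gaussian
# likelihood with known noise and a flat prior on b. The posterior mean,
# approximated here on a grid, coincides with the OLS estimate.
def log_lik(b):
    return -sum((y - b * x) ** 2 for x, y in zip(xs, ys)) / (2 * 0.1 ** 2)

grid = [1.5 + i * 0.001 for i in range(1001)]  # candidate slopes
peak = max(log_lik(b) for b in grid)           # subtract for numerical stability
weights = [math.exp(log_lik(b) - peak) for b in grid]
post_mean = sum(b * w for b, w in zip(grid, weights)) / sum(weights)

print(ols_b, post_mean)  # the two estimates agree closely
```

With an informative prior instead of a flat one, the same program yields ridge-style shrinkage, which is the sense in which the classical algorithms sit inside the general framework.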

    • zengid 6 years ago

      Thanks for explaining. That seems to be the neat thing that I'm picking up on about what Microsoft Research people keep calling "Model Based Machine Learning" [0], in that you construct a model based on your assumptions about the problem, and by implementing it, the compiler can fit it to an appropriate algorithm.

      [0] https://www.youtube.com/watch?v=zKUFSKRjTIo and also https://github.com/dotnet/infer

      • ssivark 6 years ago

        Yup! Their WIP book is a fantastic introductory read: http://www.mbmlbook.com/ (Needs only high school background)

        Helps gain very nice and concrete intuition, before getting lost in math or code.

    • tr352 6 years ago

      > For me, it helps organize a lot of well-known algorithms as special cases of a general framework—which is worthwhile in itself.

      Which well known algorithms do you have in mind here?

      • ssivark 6 years ago

        For a flavor, see Tom Minka’s recent talk/slides “From automatic differentiation to message passing”.

        Difficult to give a quick answer. I’m also not aware of any good resources where this is spelled out. If you’re seriously interested, feel free to hit me up for a deeper discussion.

        • tr352 6 years ago

          Thanks, I’ll check that out. My own understanding is that pretty much any probabilistic graphical model can be expressed as a probabilistic program, combined with pretty much any mode of inference. How such programs compare to specialized algorithms in terms of efficiency is not clear to me. I’m asking because my understanding is based mostly on theory and I’d like to learn more about probabilistic programming in practice.
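To make the graphical-model-as-program point concrete, here is a minimal stdlib-only sketch (all probabilities invented): a tiny three-variable model written as a generative joint density, with inference done by exhaustive enumeration, which is the simplest "mode of inference" one can pair with such a program.

```python
from itertools import product

# A small graphical model written as a joint density:
#   rain ~ Bernoulli(0.2)
#   sprinkler ~ Bernoulli(0.4)
#   wet ~ Bernoulli(0.9 if rain or sprinkler else 0.1)
def joint(rain, sprinkler, wet):
    p = (0.2 if rain else 0.8) * (0.4 if sprinkler else 0.6)
    p_wet = 0.9 if (rain or sprinkler) else 0.1
    return p * (p_wet if wet else 1 - p_wet)

# Inference by exhaustive enumeration: P(rain | wet=True)
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
print(num / den)
```

Enumeration is exponential in the number of variables, which is exactly where the efficiency question above bites: specialized algorithms (belief propagation, variable elimination) exploit the graph structure that a naive interpreter of the program ignores.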

  • rsrsrs86 6 years ago

    I use both, and for very different problems.

hinkley 6 years ago

The first time I saw this phrase I thought it was going to describe something like software branch prediction, speculative execution, self tuning algorithms, or heck even Bloom filters or hyperloglog. That was a direction my first mentor and I used to talk about and it’s one of my regrets that I never did much in that arena.

My brain wants this term to mean something else and I become momentarily excited every time this topic gets reposted.

xvilka 6 years ago

A corresponding framework in Julia is Turing [1][2].

https://turing.ml

https://github.com/TuringLang

kevinskii 6 years ago

It has been a few years since I looked at this, and it looks like a lot has been added since then. It's certainly worth a look. But at the time I found Allen B. Downey's "Think Bayes" to be a more thorough and comprehensive resource: https://greenteapress.com/wp/think-bayes/

glial 6 years ago

For anyone interested in learning more, Stan is an excellent alternative probabilistic programming language:

https://mc-stan.org

with thorough documentation:

https://mc-stan.org/users/documentation/

inertiatic 6 years ago

Oh, I've been meaning to go through this book, as we used PyMC on my last job to build an A/B testing system (I only did reviews, and a lot of it went over my head).

I recently started going through it again and it's pretty fascinating as someone not familiar with the field.
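For readers wondering what a Bayesian A/B testing system boils down to, here is a hedged stdlib-only sketch of the Beta-Binomial model such systems typically implement (this is not the commenter's actual system; the conversion counts are invented):

```python
import random

random.seed(1)

# Hypothetical A/B data: conversions / trials per variant
a_conv, a_n = 120, 1000
b_conv, b_n = 100, 1000

# Beta(1, 1) priors are conjugate to the binomial likelihood, so each
# posterior is Beta(1 + conversions, 1 + non-conversions) in closed form.
def posterior_sample(conv, n):
    return random.betavariate(1 + conv, 1 + n - conv)

# Estimate P(rate_A > rate_B) by drawing from both posteriors
draws = 20000
wins = sum(posterior_sample(a_conv, a_n) > posterior_sample(b_conv, b_n)
           for _ in range(draws))
print(wins / draws)  # probability that variant A beats variant B
```

A PyMC model of the same thing replaces the conjugate shortcut with MCMC, which is what makes richer variants (hierarchical priors, revenue-per-user models) tractable.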

odyslam 6 years ago

Thanks for sharing! I have been wanting to get into the field in order to play around with random-walk Monte Carlo methods.
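The simplest random-walk Monte Carlo method is the Metropolis algorithm, which the book builds toward. A minimal stdlib-only sketch targeting a standard normal (step scale and chain length are arbitrary choices):

```python
import math
import random

random.seed(42)

# Random-walk Metropolis: propose a Gaussian step from the current point,
# accept with probability min(1, p(proposal) / p(current)).
def metropolis(log_density, start, steps, scale=1.0):
    x, samples = start, []
    for _ in range(steps):
        proposal = x + random.gauss(0, scale)  # symmetric random-walk proposal
        if math.log(random.random()) < log_density(proposal) - log_density(x):
            x = proposal
        samples.append(x)
    return samples

std_normal = lambda x: -0.5 * x * x  # log-density up to a constant
samples = metropolis(std_normal, start=0.0, steps=20000)
mean = sum(samples) / len(samples)
print(mean)  # should be near 0 for a well-mixed chain
```

Libraries like PyMC and Stan use gradient-based samplers (NUTS) rather than this plain random walk, but the accept/reject skeleton is the same.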
