Ask HN: Is deep learning obsession in college ill founded?

43 points by muazzam 6 years ago · 36 comments


Background:

I'm a CS junior (about to become a senior), and in our last year people choose a capstone project that they work on for the entire year. For some years now (say, since 2017), deep learning projects have completely dominated the other projects, both in number and in the awards they go on to win. I understand the principles behind it, even find it cool, but the whole inscrutable nature of it is problematic to me.

ageitgey 6 years ago

Deep learning is what is cool in CS right now. It lets you do new things that you couldn't do before. Based on that, it's going to be over-represented in projects by undergrads looking to do "cool" projects and show off their new-found skills.

But that isn't really a problem. In most cases, the projects you do as an undergrad don't affect your professional life in any way after you get your first job. Very, very few undergrad projects turn into real projects that anyone uses after the student graduates.

So don't worry too much about it. Ten years ago, every senior project was an app. Twenty years ago, every senior project was a website. It's just a sign of the times and doesn't matter in the long run.

rvz 6 years ago

> I understand the principles behind it, even find it cool, but the whole inscrutable nature of it is problematic to me.

Spot on.

On top of that, we have 'AI' models getting fooled by adversarial attacks that involve changing just a single pixel. As long as these issues are not tackled or researched well enough, we'll pretty much be heading into another AI Winter and the hype cycle will go through its trough of disillusionment phase. Being unable to inspect the black-box internals of such deep-learning systems is why highly regulated industries where life is at stake, such as healthcare and other safety-critical fields, label deep-learning solutions as unsafe.

Sure, all you see right now are other students and startups 'applying' deep learning everywhere, but unlike DeepMind and OpenAI, they are hardly advancing the field. In terms of learning, it's something good to learn as a student in college, but creating an AI startup now requires using Google's, Amazon's, or Microsoft's data centers for training, which is clearly not sustainable anyway.

Security related projects and research are always where it's at.

  • visarga 6 years ago

    > As long as these issues are not tackled or researched well enough, we'll pretty much be heading into another AI Winter and the hype cycle will go through its trough of disillusionment phase.

    It's not all or nothing as you present it. ML models can be useful even if they are imperfect - and we should not forget humans aren't perfect either. For example, a model could cut the time needed to enter an invoice into the database by 50%. It's imperfect, yet useful.

    A model need not run alone without any safeguards. It can have plain old programming rules to validate its outputs, or keep a human in the loop.
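
    As a minimal sketch of that idea (all names here are hypothetical, nothing from the thread): the model's output passes through plain validation rules, and anything that fails is escalated to a human instead of going straight into the database.

    ```python
    # Hedged sketch: `model.extract_invoice_fields` is a hypothetical call
    # standing in for whatever ML model reads fields off an invoice image.

    def validate_invoice(fields: dict) -> bool:
        """Plain old programming rules guarding the model's output."""
        return (
            fields.get("total", -1) >= 0
            and fields.get("currency") in {"USD", "EUR", "GBP"}
            and fields.get("invoice_id", "").strip() != ""
        )

    def process_invoice(image, model, review_queue: list):
        fields = model.extract_invoice_fields(image)  # hypothetical model call
        if validate_invoice(fields):
            return fields              # safe to auto-enter into the database
        review_queue.append(image)     # human in the loop handles the rest
        return None
    ```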

    > Sure, all you see right now are other students and startups 'applying' deep learning everywhere, but unlike DeepMind and OpenAI, they are hardly advancing the field.

    On the contrary, I would say that what DeepMind and OpenAI are doing is largely irrelevant for industry. There is a huge number of domains where no ML model has been created, and that is because there are so few people who can make them. The low hanging fruit hasn't been picked yet. It's like electricity at the beginning of the 20th century. The work these students and startups are doing is the good, useful work. You don't need DeepMind grade models to solve most real problems.

    > creating an AI startup now requires using Google's, Amazon's, or Microsoft's data centers for training

    You can train most useful models on a single machine today. Some, like logistic regression, train in seconds or minutes. Others take an hour, or a day. Some heavy ones take a week. If you don't do hyperparameter search or cutting-edge research, you only need a few runs to get a working model. It's data tagging that usually takes months or years.
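
    For a sense of scale, here is a minimal single-machine run (scikit-learn and a built-in toy dataset are my choices for illustration, not the commenter's): a plain logistic regression fits in well under a second on a laptop.

    ```python
    # Minimal sketch: train and evaluate a logistic regression on one machine.
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = LogisticRegression(max_iter=5000)  # no GPU, no cluster
    clf.fit(X_train, y_train)                # finishes in well under a second
    print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
    ```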

_bxg1 6 years ago

Personally I think deep learning is a bubble, and it will soon collapse to its natural place in computer science. Which is not to say that it's a fad that will disappear, only that it will retreat to being just a regular tool among the many we have for solving different kinds of problems. Its inscrutable nature is definitely problematic for some use-cases, and not so problematic for others.

  • sin7 6 years ago

    I've been doing the data thing for a while. During one of my defenses of R, someone brought up that R was a black box: if you programmed in R, you were a user who just filled in the correct function arguments and it spit out the answer. That was when my thoughts on machine learning changed.

    The vast majority of us are users. We massage the data into a certain shape, then feed it through a machine that someone else created. We can change the parameters. We can change the data. But few of us are ever going to look into the code of a random forest function.

    I've switched tracks and started doing web development. Playing with the hyperparameters in machine learning is no different from changing the feel of a dropdown by tweaking the colors, fonts and other things to fit a certain aesthetic.

    I could be wrong, but I have yet to meet anyone calling themselves a data scientist who has done anything besides use packages created by others. I think that opens the field up to becoming just another tool, no different from Excel.

    • ishjoh 6 years ago

      Years ago, during the first ML hype wave, I completed the excellent MOOC by Andrew Ng. In that course he did go through the math, and it helped me understand what was going on under the hood, but even then the value wasn't in understanding what the model was doing, it was in understanding whether your model was doing something well. I agree with your take that using packages created by others will be most of what we do moving forward, and that's also true of pretty much all software development.

    • itronitron 6 years ago

      I consider R to be one of the lower-level ML/DS languages, in that people who use R are typically fairly intentional about what they are doing.

      I've been working in this space for a long time and recently started reading up on a particular ML technique which gained a lot of popularity over the past five years. What strikes me about 95% of the material available is how over-hyped and uninformative it is, to the point of just being wrong.

  • visarga 6 years ago

    While I agree with your sentiment regarding ML engineers - they are just another kind of dev, and that's where it will go - I think DL is not just another tool in the software toolbox. It's more like a paradigm changer, like the printing press, the engine, electricity, communication, and computing. It tends to eat the world.

    • s1t5 6 years ago

      > It's more like a paradigm changer, like the printing press, the engine, electricity, communication, and computing.

      Either we really disagree about deep learning, or you vastly underestimate the influence of the other technologies that you've listed.

      • mattkrause 6 years ago

        I'd maybe buy it if you broadened the claim to "quantification" or something. It's undoubtedly true that aggressively collecting and analyzing data has transformed society a lot: Taylorism, mass production, bureaucracies, science (not just data), even Guinness. However, this has been going on for ~150 years already.

        As for deep learning specifically...meh.

    • _bxg1 6 years ago

      That's exactly what I'm arguing against: I think the "eating the world" part is a hype-cycle. I think DL has truly revolutionized a handful of very narrow cases - computer vision and speech recognition/synthesis, for example - but that people are vastly over-estimating how "paradigm-changing" it actually is.

      • visarga 6 years ago

        Take computer vision alone. It has applications in manufacturing, robotics, self-driving cars, medical scans, cartography, agriculture, and many other fields. It's like the motor - a universal tool.

      • Jack000 6 years ago

        Yes, but reliable facial recognition alone has huge social implications, never mind all the other potential applications for cognitive automation.

  • tuatoru 6 years ago

    > Its inscrutable nature is definitely problematic for some use-cases, and not so problematic for others.

    It's a problem wherever reliable operation is required, or analytic tractability (explanation) is required, or where resources available for data labeling are limited.

    Its niche appears quite small, unless and until solid mathematical foundations are developed for it.

    • _bxg1 6 years ago

      It shines the most in "soft computing": computer vision, etc. These also tend to be the least important areas to explain or audit, partly just because they're so trivial to verify with nothing but human intuition.

      Where it becomes problematic - and where DL isn't actually very well-suited anyway - is making "real decisions"; things that would normally be backed by rigid logic.

  • uoaei 6 years ago

    NNs are "computer science" only insofar as numerical algorithms are. Which is to say, beyond the question of big-O, it's all math.

uoaei 6 years ago

Not ill-founded so much as jumping the gun.

To understand why neural networks work, you will have to understand how a whole host of smaller, simpler ML models work in excruciating detail. Multiple linear regression, logistic regression, etc. What they mean, how they work, what's really going on "inside", what the underlying probabilistic model represents, etc.

Neural networks are great because they take basically all of those smaller ideas and concatenate them into a super flexible statistical machine. It's really cool to see the "in->out", but it's even cooler once you have a good grasp of what's going on in the intermediate steps.

In my experience, most people working with neural networks don't have those details down. This goes 100-fold for non-research roles. They learned the Keras API and are happy stacking layers, and as long as the output looks nice they push to production. For most cases empirical validation is probably enough, because NNs can usually achieve some incremental improvement just by virtue of the fact that they have so many damn degrees of freedom. But to get a well-performing, well-founded model, you need to know the ins and outs.
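
For readers who haven't seen it, the layer-stacking workflow described above looks roughly like this (a generic Keras sketch with arbitrary sizes, not anyone's actual model):

```python
# "Stack layers until the output looks nice," sketched with the Keras API.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=10)  # ...and ship it if the numbers look good
```

The point is how little understanding the API demands: every line above is boilerplate, and none of it requires knowing what the intermediate layers are actually doing.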

deuslovult 6 years ago

I'm an ML engineer, and I agree with you - deep learning is by far the most common approach for new problems in informatics.

Imo deep learning is so popular because it "works". For a classification problem, if you try both a linear baseline and a deep learning model, and you do a reasonable job of hyperparameter tuning and experimental design, the deep learning model will likely outperform the simpler one. This holds true across many problem spaces.

I think the issue is that modern DL frameworks make it a little too easy to get pretty good performance on new problems. Other techniques generally require more background knowledge to make reasonable modeling assumptions, and still frequently perform worse than a naively applied DL approach.

I think DL will remain, in practice and education, a very popular tool. But it is essential to learn traditional statistical inference and other background to appropriately contextualize DL models so it isn't just some form of black magic.

  • mattkrause 6 years ago

    A lot of those comparisons strike me as shaky.

    It's easy to beat a naive logistic regression model with a good neural network, but the gap often closes once you start trying to tune the logistic model too. (And it's not like the neural networks aren't tuned either--architecture search, data augmentation, etc).

    Recent review on medical data: https://www.sciencedirect.com/science/article/abs/pii/S08954...

    • deuslovult 6 years ago

      Logistic regression is exactly a NN with no hidden layers and a sigmoid activation function. A feedforward NN with additional layers is strictly more expressive than logistic regression.
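
      Spelled out (my notation, not the commenter's): logistic regression computes

      $$\hat{y} = \sigma(w^\top x + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}},$$

      which is exactly a network with zero hidden layers and a sigmoid output unit. Adding one hidden layer with activation $g$ gives $\hat{y} = \sigma\left(w_2^\top\, g(W_1 x + b_1) + b_2\right)$; since a ReLU hidden layer can pass $x$ straight through (using $\mathrm{relu}(x) - \mathrm{relu}(-x) = x$), the deeper network can represent everything the shallow one can, and more.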

      • mattkrause 6 years ago

        Yes! The million dollar question is how much of that expressivity is actually required.

        In many papers, the "baseline" logistic regression model is very stripped down: y~logit(.). The neural network, meanwhile, has had its expressiveness optimized in various ways. People aren't comparing against a plain 3-layer feedforward network; there's augmentation and pre-training, architecture search and special learning schemes.

        My point is that if you want to claim that a problem needs the expressivity that (only) a neural network provides, you ought to be devoting a great deal of effort to the logistic regression model too. Make it a steelman, rather than a strawman, if you will.
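
        As a sketch of what "steelmanning" the baseline might look like (scikit-learn is my choice here, and the names are illustrative): give the logistic regression feature crosses, scaling, and a regularization sweep before comparing.

        ```python
        # A tuned logistic regression baseline rather than a bare y ~ logit(.) fit.
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import GridSearchCV
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import PolynomialFeatures, StandardScaler

        pipe = make_pipeline(
            PolynomialFeatures(degree=2, include_bias=False),  # interaction terms
            StandardScaler(),
            LogisticRegression(max_iter=5000),
        )
        search = GridSearchCV(
            pipe,
            {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},  # regularization sweep
            cv=5,
        )
        # search.fit(X_train, y_train)  # spend real effort here before crowning the NN
        ```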

poulsbohemian 6 years ago

22 years ago when I was in your shoes, distributed systems were the topic of the day, and all of us were going to be building systems with CORBA and DCOM... so guess what my project and paper were about? That's right, things I never touched in my career, but darn it if they didn't help me get my first job because they were hot topics of the moment.

So, pick something in "AI" that is the hotness of the moment, learn what you can, do your best, and then get on with life and career.

md2020 6 years ago

My situation is the same as yours: CS junior heading into my capstone project next semester, and my opinion is a resounding yes. The deep learning obsession is almost certainly a hype bubble. I have observed the same here at my university; the "But what if we did it with deep learning?" projects are almost reaching meme status.

It's rather disheartening as someone who actually is interested in AGI, but I've been driven away from wanting to pursue the field since the current research seems lacking in ambition and substance. My previous summer internship had me reading a lot of deep learning papers on arXiv, and the vast majority of them seem to be tweaking a single parameter in a DNN, achieving a 0.3% increase in score on an arbitrary benchmark, and calling it a meaningful result. I'd personally like to see more people doing the kind of work DeepMind does, which seems to actually achieve breakthroughs informed by knowledge from neuroscience, but I have a feeling we won't see that anytime soon since DeepMind gets their pick of the best researchers in the world.

I'm just an undergrad though, and would love to hear the opinions of more knowledgeable people! Specifically, I'd like to hear arguments against the sentiment that "Deep learning right now is pretty much alchemy". How is the work in deep learning helping us understand the nature of intelligence, rather than just helping Facebook and Amazon better target advertisements and product recommendations?

  • visarga 6 years ago

    > How is the work in deep learning helping us understand the nature of intelligence

    Neural networks' performance on a problem is a benchmark of its real difficulty. It gives us insight, a new perspective.

    In millennia of deliberation, what have philosophers discovered about the nature of intelligence? And then... a neural net beats us at all board games, another can solve differential equations, another can translate, another can see, and so on. Have we really not learned anything from these inventions?

    Another advantage of DL is that it frames the problem of intelligence in mathematical concepts and rigorous evaluation.

    I, for one, have reconsidered all my spiritual beliefs after learning about the agent-environment-reward model of reinforcement learning. A new way of framing the agent and life, so parsimonious and powerful. And it does not require a soul, or a god, or anything outside our real environment, and yet it can explain so much.

    The whole machine learning paradigm is another powerful concept through which we can understand how we might function. Previously you might wonder how emotion, thought, sensation, imagination and will relate to each other. Now we can understand how they might be implemented and wired together, and what principles support their function.

    • md2020 6 years ago

      > And then... a neural net beats us at all board games, another can solve differential equations, another can translate, another can see, and so on. Have we really not learned anything from these inventions?

      I would still argue no, we haven't learned anything about intelligence from these. They are impressive achievements, but strictly in the sense of "We found a way to use computers in a way we were not using them before".

      1.) Neural nets + MCTS beat us at all board games--board games that humans invented and can achieve mastery at. If a human Go player were born who could beat that version of AlphaZero, we would not say that person had solved intelligence.

      2.) Differential equations: also invented/discovered by humans, can also be solved by humans

      3.) Translate and see: See above, with the additional caveat that humans actually are better at translating and seeing than deep learning systems are.

      In addition, these were all achieved individually by systems with different architectures and massive amounts of training data that would amount to several human lifetimes. An 18 year-old human can play board games, drive a car, do differential equations, and learn multiple languages, with a single brain using a generalized structure and a fraction of the "training data" afforded to DL. This indicates to me that ML as a whole is still very far off the mark of General Intelligence.

      > Previously you might wonder how emotion, thought, sensation, imagination and will relate to each other. Now we can understand how they might be implemented and wired together, and what principles support their function.

      Previously? This is still an unanswered question. Show me where deep learning research has even come close to producing a system that can learn and adapt like a human mind does.

    • sifar 6 years ago

      >> Have we really not learned anything from these inventions?

      Not really, and that is the problem. We can create something that sort of works - but we don't understand it. The only thing we have learned is that we don't need to understand intelligence to build something that works on some tasks.

Irishsteve 6 years ago

It's the current trendy technology (for a lot of good reasons), so it's only normal that students gravitate towards it.

It's not ideal - but if it wasn't DL, it would be another topic / application.

debbiedowner 6 years ago

It's a well-funded research area that in many cases outperforms other solutions, so it's not going anywhere in unis for the next decade at least.

It's not completely inscrutable either. To see why it works, follow the thread of deep learning theory research, starting from the names here: http://www.vision.jhu.edu/tutorials/CVPR17-Tutorial-Math-Dee...

en4bz 6 years ago

> During the gold rush it's a good time to be in the pick and shovel business

AnimalMuppet 6 years ago

For your purposes, it doesn't matter. You can do your capstone in the current hotness, or not, as you please. If you do it in deep learning and in five years deep learning is passe, it won't matter. You'll have your degree and be four years into your career. Or you'll be four years into grad school. You'll be fine. (In this scenario, if your grad school is in deep learning, you won't be fine. Think harder about that choice than about your capstone.)

Given all that you've said, a capstone that tries to dent the "inscrutable nature" of deep learning might be an interesting choice.

omarhaneef 6 years ago

I think deep learning works better than simple linear regression because we have already applied simple linear regression everywhere we can, while we have only just started to get going with "deep" learning. And the best part is that as new computers come out, the deeper we can go.

I will point out that the real win is with new data sources, and simple linear regressions may still work there.

burfog 6 years ago

Possible projects that might keep you occupied for the entire year:

Make a Rust front-end for the GNU Compiler Collection.

Emulate something.

Write a hypervisor.

sys_64738 6 years ago

"Deep learning" is the latest buzzword to get all the dollars nowadays. In a past decade corporations used to even fund Second Life for use at work.

Kenji 6 years ago

The most important thing is this: you pick a problem you want to solve, you pick the tools you want to solve that problem with, and one of the tools in your bag is deep learning, which you may end up using. Do it in that order. Do not pick deep learning and then try to solve everything with deep learning; that's putting the cart before the horse. That's all I can say about things like deep learning, blockchain, etc. Let the problems lead you to the solutions, not the other way around.
