
Meta Unveils New AI Supercomputer

wsj.com

156 points by davidstoker 4 years ago · 198 comments

chundicus 4 years ago

I can't shake the feeling that a trillion or a quadrillion parameters won't solve the fundamental shortcoming of ML models not being models of artificial intelligence. I guess there's no way of knowing until we reach AGI, but I've never heard a compelling argument for why pure ML would get us there. GPT-3 seems more like an argument against that hypothesis (in my view) than for it. Even the best, most expensive models today are incredibly brittle for enterprise use cases that shouldn't necessarily require AGI.

I've always imagined AGI (perhaps naively) as being achieved by clever usage of ML, plus some utilization of classical/symbolic AI from pre-AI winter days, plus probably some unknown elements.

  • ddalex 4 years ago

    I feel there are three requirements for an NN-based AGI, inspired by biology:

    a) an internal feedback loop that evaluates a possible output without actuating it, and self-modifies the parameters if the possible output is not what is needed

    b) the capability (based on a) to model its own behaviours without acting on them, and to model other agents' behaviours and incorporate those models into the feedback

    c) the ability to switch between modelling its own behaviour and other agents' behaviour intentionally, by the model itself, as part of the feedback loop

    i.e. what I feel is totally missing in self-driving cars today is the capability to model OTHER traffic participants' actions and intentions. An experienced and attentive human driver does this all the time: pays attention to pedestrians on the side in case they jump in front of the car, pays attention to where other cars are LIKELY to go, pays attention to how the bicyclist currently being overtaken may fall, even pays attention to a random soccer ball flying out of a courtyard because a kid may be chasing it. I am not seeing any self-driving car trying to model any agent outside its own.

    • joakleaf 4 years ago

      Cruise actually considers both social dynamics and uncertainty (i.e. what can hide behind an obstacle, or where pedestrians/bikes/cars are likely to move).

      If you are interested in self-driving cars, I can highly recommend their presentation from November 2021:

      https://youtu.be/uJWN0K26NxQ?t=1467

      For me it felt more convincing than Tesla's (a few months prior):

      https://www.youtube.com/watch?v=j0z4FweCy4M

      • ddalex 4 years ago

        Oh, I hadn't heard about Cruise until now. Will follow them, thank you.

    • Jasper_ 4 years ago

      That's what gets me about self-driving cars. The road is a very social space, and follows social rules. Pretty much all of the communication and norms happening on the road are social ones.

      The thing that would convince me AGI is ready would be to play a convincing game of poker. Or join in on a conversation mid-way through, listen to it, and engage with it actively. Show that machines are able to pick up on social cues, understand them, and learn new ones. It's a high bar, yes, but it's in my opinion a prerequisite for a self-driving car that's able to share roadways with other cars, cyclists, and kids playing in the street.

    • rileyphone 4 years ago

      https://www.creativemachineslab.com/uploads/6/9/3/4/69340277...

      "A robot modeled itself without prior knowledge of physics or its shape and used the self-model to perform tasks and detect self-damage."

      • knodi123 4 years ago

        lol, "the morphology was abruptly changed" is the most coldly scientific description of an injury I've ever heard.

    • arduinomancer 4 years ago

      Well, the theory behind the end-to-end image-based self-driving models is that they are supposed to cover that.

      The reasoning is that given enough training data the system would know the pedestrian is going to jump out or the cyclist is going to fall just based on sheer volume of training examples. It would have seen that scenario tons of times in the image data.

      Whether that will actually work is the question, though.

    • Nasrudith 4 years ago

      Personally I think that biology may be a flawed approach for most applications, although it is a worthy end in itself just for its role in understanding ourselves, in a forensic-archaeology, try-to-replicate sort of way, let alone any potential insights into biological brains.

      Biology is glacially slow in comparison and one of the advantages from computing is being fast.

      I believe that not modeling it is partially by design, as a result of responsibility and blame frameworks. If you depend on possible actions taken by others to be safe, you are reckless. Extrapolating from current motion is more reliable than trying to profile everything. "They are moving toward the street at 3 mph and 20 ft away; their vector will intersect with the car; brake to avoid collision or accelerate enough to leave the intersection zone before they can even reach us" seems a more reliable approach. It isn't as if a kid will suddenly teleport into the road.
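
      That extrapolation rule can be sketched as a toy constant-velocity check (all numbers, names, and thresholds here are illustrative, not from any real system):

```python
# Toy sketch of the "extrapolate from current motion" rule described above.
# Units and thresholds are illustrative assumptions.

def time_to_intersection(distance_ft: float, speed_mph: float) -> float:
    """Seconds until an agent covers distance_ft at a constant speed_mph."""
    feet_per_second = speed_mph * 5280 / 3600
    return distance_ft / feet_per_second

def plan(distance_ft: float, speed_mph: float, time_to_clear_s: float) -> str:
    """Brake if the agent could reach our path before we clear the zone."""
    if time_to_intersection(distance_ft, speed_mph) < time_to_clear_s:
        return "brake"
    return "proceed"

# The comment's example: 3 mph, 20 ft away -> ~4.5 s until their path meets ours.
print(plan(20, 3, 2))   # car clears the zone in 2 s: proceed
print(plan(20, 3, 10))  # car needs 10 s: brake
```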

    • mrkramer 4 years ago

      I doubt there will be AGI in our lifetime. Maybe some breakthrough happens but it won't be even close to human intelligence.

      • bduerst 4 years ago

        I dunno. I didn't think consumer-common machine vision was achievable in our lifetimes either, yet everyone has a phone that can do it.

        It's like all major tech breakthroughs - it seems impossible despite all the pieces being there, right up until someone puts them together.

        • mrkramer 4 years ago

          Computer vision, image recognition, audio recognition, and speech recognition became somewhat easy once Moore's law kicked in and the computer software industry emerged. But AGI is a whole other beast. For general intelligence you need underlying infrastructure that runs it and guides it, just like the nervous system does for us or the operating system does for computers. You cannot, for example, glue together computer vision and speech recognition and call it intelligence when all it does is recognize what it sees and hears.

      • chillacy 4 years ago

        My observation with statements like this, both for and against some event occurring, is that you'd have to be very specific with the definitions of "AGI" and "human intelligence"; otherwise everyone ends up claiming they predicted the outcome correctly (e.g. Ray Kurzweil's prediction evaluations seem to me like an exercise in motivated reasoning).

  • mindcrime 4 years ago

    > I've always imagined AGI (perhaps naively) as being achieved by clever usage of ML, plus some utilization of classical/symbolic AI from pre-AI winter days, plus probably some unknown elements.

    For what it's worth, this is my view as well. And I don't think it's particularly naive. Plenty of people have researched and/or are researching aspects of how to do this. But how to combine something like a neural network, with its distributed (and very opaque) representations, with an inference engine that "wants" to work with discrete symbols is non-obvious. Or at least it appears to be, since apparently nobody has figured out how to do it yet - at least not to the level of yielding AGI.

    > but I've never heard a compelling argument for why pure ML would get us there.

    The simplistic argument would be that ML models are, in some sense, trying to replicate "what the brain does", and it stands to reason that if today's toy ANNs (and let's be honest - the largest ANNs built to date are toys compared to the brain) are something like the brain, then in principle if you scale them up to "brain level" (in terms of numbers of neurons and synapses), you should get more intelligence. On the other hand, anybody working with ANNs today will tell you that they are at best "biologically inspired" and aren't even close to actually replicating what biological neural networks do. So... while people like Geoffrey Hinton have gone on record as saying that "ANNs are all you need" (I'm paraphrasing, and I don't have a citation handy, sorry), I tend to think that in the short term a valid approach is exactly what you suggested: combine ML and use it for what it's good at (pattern recognition, largely) and use "old fashioned" symbolic AI for the things it is good at (reasoning/inference/etc.)

    Now, to figure out how to actually do that. :-)

    • space_fountain 4 years ago

      It seems quite clear to me that human brains are not actually doing much symbolic logic. What symbolic logic we do do has been bolted on using other faculties. I think the problem is that reasoning about our own minds is incredibly tough. We want there to be some sort of magic sauce to what makes us, us, and so we reject things like ANNs that seem somehow too simple. It probably is right that we won't just be able to scale up the number of parameters and get human-like performance - there are hints that returns start to level off - but I'm also unsure why people are so sure we can't.

      • mindcrime 4 years ago

        > It seems quite clear to me that human brains are not actually doing much symbolic logic. What symbolic logic we do do has been bolted on using other faculties.

        I agree. But my interest is in engineering something that works, not necessarily in creating an exact replica of the human brain. That's why my interest falls into the domain of symbolic / sub-symbolic integration - because it strikes me as a faster path to more usable computer intelligence.

        I have no problem believing that a sufficiently large ANN, with the right training and inference algorithms, could achieve AGI. My problem is that A. right now achieving that seems very out of reach to me (but I could be wrong) and B. it seems unnecessary to me to remain wedded to the idea of 100% (or even 90% or 80% etc.) fidelity with our biological brains. After all, if we want something just like a human brain, we just need a man, a woman, and 9 months of time.

        Anyway, I think it's OK to think of engineering in "short cuts" by using things we know computers are good at, and things we already know how to do, and trying to combine them with ANN's in such a way as to make something useful. Will it ever yield AGI? I have no way of knowing. And even if it does, would that approach actually be faster than a pure ANN approach? Again, I don't know. But for now, I spend my time on symbolic/sub-symbolic integration nonetheless.

        > I think the problem is that reasoning about our own minds is incredibly tough.

        Yes, definitely.

        • space_fountain 4 years ago

          Very fair takes. I could certainly imagine elements being pulled in. For example, things like AlphaZero are, to my understanding, already coupling tree searches to neural nets, and I sort of expect that any general solution would include some of that. But symbolic approaches seem to consistently do worse, despite lots of people thinking they won't and plenty of money to be made. I think part of the problem is that what we want from AI is to interface with humans, and humans use something fuzzy to understand the world, so trying to model that rigidly will be hard.

    • edgyquant 4 years ago

      Even if they did replicate how the brain works, our brains aren't one of these networks trained for specific things - they are millions, maybe billions, of them combined.

      • mindcrime 4 years ago

        Indeed. The learning/training we do today for ANNs clearly isn't what humans do. So yeah, even if we had billion-"neuron" ANNs that were more biologically plausible, we'd probably still have to figure out more about how human learning works in order to come up with the right way to train the AI.

  • mFixman 4 years ago

    AGI seems hard because each year more and more problems that were previously considered close to AGI are solved.

    Playing chess at a grandmaster level was considered something only a human could do until the 1990s, and now no human has beaten the best computer in 17 years, while AGI seems further away than ever.

    Mark my words: we'll create an AI that can pass the Turing test this decade, but we'll still be as far away from the badly defined general problem as we ever were.

    • lostmsu 4 years ago

      The chess example is not that strong: "the best computer", or more precisely the software that has beaten humans since the 1990s, was specifically designed to play chess. That was the case until AlphaZero did the same in 2017 for a whole class of turn-based games.

      • lostmsu 4 years ago

        To add to that, it is quite possible that AlphaZero is already a general intelligence. Specifically, it may be that, given some robotic manipulation, some goals in the real world, and lots and lots of tries (tens or hundreds of millions), it may beat an average human at "life".

  • MR4D 4 years ago

    I agree. I read Jeff Hawkins' book On Intelligence [0] back when it came out, and it had a profound effect on my thinking. Chasing more data, aka "parameters", doesn't seem to be the right answer. I think more of a Bayes model like spam filtering, but cobbled together with other Bayes models looking at other things, until something emerges that we call "intelligent". Heck, I'd consider Google's spam filtering pretty intelligent today.

    [0] - https://en.wikipedia.org/wiki/On_Intelligence
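
    The "Bayes model like spam filtering" idea can be sketched as a tiny naive Bayes classifier in the spirit of classic spam filters (the training data below is invented for illustration):

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (label, tokens). Returns per-label token counts and doc totals."""
    counts, totals = {}, Counter()
    for label, tokens in docs:
        counts.setdefault(label, Counter()).update(tokens)
        totals[label] += 1
    return counts, totals

def classify(counts, totals, tokens):
    """Pick the label maximizing log P(label) + sum log P(token|label), Laplace-smoothed."""
    vocab = len({t for c in counts.values() for t in c})
    n_docs = sum(totals.values())
    def score(label):
        c = counts[label]
        denom = sum(c.values()) + vocab
        return math.log(totals[label] / n_docs) + sum(
            math.log((c[t] + 1) / denom) for t in tokens)
    return max(counts, key=score)

counts, totals = train([
    ("spam", ["free", "money", "win"]),
    ("ham", ["meeting", "notes", "agenda"]),
])
print(classify(counts, totals, ["free", "money"]))  # spam
```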

    • jcims 4 years ago

      Hawkins' way of thinking really maps well for me also. It seems like more parameters help until they don't; then you need to encapsulate those networks and pin them to some reference frame, then create hierarchies of these networks and a system to generalize and compress those hierarchies (aka patterns), rinse and repeat.

      My brother just became a grandpa, and I was watching his grandson navigate the world this past weekend. It's unbelievable how quickly the brain can extrapolate a new relationship between objects/actions/etc. and then apply it elsewhere. Minimally you see it in the drinking action applied to all sorts of things, the sort of repetitive clenching/releasing of the fingers to find things to grip without looking, and so on - watching mom use a fork and very quickly understanding how to grasp and manipulate it. The model of just training everything from exogenous data into a flat network seems like it will hit some asymptotic limit.

  • rafaelero 4 years ago

    The scaling hypothesis says that we just need more processing power to achieve things we regarded as "impossible for non-intelligent agents". So far, the scaling hypothesis is proving itself correct despite still-prevalent skepticism.

  • dekhn 4 years ago

    It would be pretty embarrassing (or relieving?) if it eventually turned out there was nothing special about human intelligence - just that we crossed some threshold of neurons and other brain bits ("a few quadrillion parameters") to convincingly fool ourselves that we are self-aware, have agency, and do anything "intelligent" (other than some fancy stuff that looks like the physics/biology equivalent of state-of-the-art ML).

    I am a proponent of using a working theory that intelligence is an emergent property and that we can in principle create new intelligences in a lab (or ML warehouse) if we provide the proper conditions, but that finding and maintaining those conditions is extremely hard. Some state-of-the-art research today aims to integrate recognition capabilities (image recognition and object detection/tracking on video, voice extraction from audio, text) with advanced generative models for language and behavior, as well as realtime rendering systems that can create realistic humans.

    If we combine those we can make a bot that appears fully interactive, passes all Turing tests, convinces a typical person it's another person... and still has nothing inside that researchers would call "artificial intelligence". It might even solve science problems that we can't, without any spark of creativity or agency. Or maybe when we make a bot with all those properties, some uncanny valley is crossed and out pops something that has objective AGI?

    As the wise robot once said, "if you can't tell the difference, does it really matter?". We should forge ahead with building datacenter-scale brains and feed them with data and algorithms, while also maintaining a cadre of research scientists who are attuned to the ethical challenges of doing so, an ops team trained to recognize the early signs of sentience, and an exec team with humanity.

  • stereolambda 4 years ago

    I'd say that the view against ANN gives humans (especially researchers) more "dignity", in the sense that we still need to figure out some deep stuff and not just add hardware. I wouldn't treat this as an argument either way, just an observation.

    Heuristically, we came to be by a very dumb process of piling up newer generations. If my pet would communicate with me on the level of GPTx, I would be very impressed. That's why nowadays I have some scepticism about the ANN critics' arguments, though I think it would be neat if they were right.

    The thing that I dislike the most in these discussions is the pervasiveness of the AGI concept and the assumption of a linear scale of intelligence. Again, I can intuitively say that I'm more intelligent than my pet: but to quantify this, we'd need to use something silly like brain size, or qualitative/arbitrary things like "this being can talk". I think that human intelligence is a somewhat random point in a very multi-dimensional space, one that technology may never even have a reason to visit. But people tend to subscribe to the notion that this is the very important "point where AGI happens".

    • tsimionescu 4 years ago

      > If my pet would communicate with me on the level of GPTx, I would be very impressed.

      GPTx is not communicating with anyone. It is generating text that resembles text it had in its training set. The fact that human text is normally a form of communication doesn't make generating quasi-random text communication in itself. GPTx is no more communicating than a printer is when printing out text.

      A cat or dog leading you to their empty food bowl is actual communication, and they are capable of much more advanced communication as well (especially dogs). The fact that it doesn't look like written text is not that relevant. They are of course worse than GPTx at producing text, just like they are worse than a printer at writing it on a blank page.

  • ravi-delia 4 years ago

    I'm most of the way towards agreeing with you, but I think you underestimate how far you could get without any major changes. Most of the brain consists of feed-forward processing, and what closed loops exist are probably replacements for backprop rather than essential to cognition. That's all the low level processing, from visual to motor. Now obviously we have higher level processing too, and it might be super weird! But no model we've made comes close to the size of even specialized brain regions, and study after study has demonstrated the power of the subconscious mind. Once we have big enough models, we might find out that all we need to take it to that final step is a while loop.

  • rawtxapp 4 years ago

    Does it really matter? If this new supercomputer means that ML engineers can iterate x% faster, which in turn increases FB's profits by even a small y%, I would think it will have already paid for itself.

  • spoonjim 4 years ago

    Not all applications require a general intelligence.

  • defaultprimate 4 years ago

    We will never achieve AGI

  • mbrodersen 4 years ago

    We are so far from having “real” AI that it is amusing to me every time I read yet another article gushing over ML. ML is fundamentally pattern matching. It is impressive tech for what it does. But humans don’t need 1 million carefully tagged images of chairs or cars to work out what a chair or car is. Our understanding of what general intelligence is hasn’t progressed much since the last AI winter. The only real difference is that computers are much faster today, enabling old technology ideas to be fast enough for practical use.

reggieband 4 years ago

I hate that I always end up referencing the Lex Fridman podcast, but it is often relevant to discussions on HN. Recently Lex spoke with Yann LeCun, and they had a brief chat about AI at Meta/Facebook [1] (where I believe Yann is currently Chief AI Scientist). He claims that AI is the core of Meta, and that if you were to take ML out of Meta's systems the company would literally crumble, because it is completely built around AI.

My feeling is this is a PR push by Facebook. All tech companies keep touting AI - especially Google, but also Microsoft, Apple, and Amazon. In some sense I believe these businesses want to control how their own success is defined. That is, they are right now convincing everyone that tech dominance is equivalent to AI dominance, which is equivalent to ML dominance. In some sense this is turning into a purity test, like "which tech company is the most AI-focused". I expect this kind of PR to accelerate as each company tries to prove its AI bona fides to the market.

1. https://youtu.be/SGzMElJ11Cc?t=6597

  • acchow 4 years ago

    These aren't startups trying to prove "AI Purity" for more funding. These are money printers that are optimizing how to print more money. And Facebook and Google are competing against each other for online advertising dollars (yes, the pie is also growing). Their revenues are up 60% and 40% over the past 2 years, so I doubt their AI plays are just about proving some purity game.

  • 6gvONxR4sf7o 4 years ago

    I haven't listened to the podcast, but the way I read a statement like that is, "Our business needs to do things that aren't as simple as defining manual rules, but the economics of our business prevents us from just paying people to do those things at scale."

  • mrkramer 4 years ago

    There is no single AI company. These are all machine learning techniques. AI is spreadsheets on steroids not real intelligence.

  • throwaway423342 4 years ago

    As a side note:

    I listened to the episode with Yann. Compared to other talks (e.g. the previous one with Brian Keating) it was a bit dull and uninteresting. The answers were not that insightful.

  • jcims 4 years ago

    > I hate that I always end up referencing the Lex Fridman podcast but it is often relevant to discussion on HN.

    I do the same thing and feel the same way, like I'm astroturfing or something. If it's any consolation, I don't remember ever seeing your references, and I hope you don't remember mine.

  • strikelaserclaw 4 years ago

    I mean, what else could he have said? People rarely speak the complete truth if their reputation or paycheck is on the line.

kleiba 4 years ago

I used to work at a university where my professor had been in automatic speech recognition for a long time but basically gave up on that line of research about 10 years ago, because he figured that universities simply cannot compete budget-wise with the big industry players.

I suppose the same will be true for most ML-related areas of research sooner or later, at least as far as applied ML is concerned.

Already, a substantial amount of research innovation in NLP and CV has been coming from big companies in recent years.

Of course there is a discussion to be had about what that means for society at large. At this point, a lot of said companies publish their results at conferences etc. But what if at some point they decide to be as "open" as OpenAI (i.e., not)?

  • Mageek 4 years ago

    Well, universities can’t compete in things like car production or rocket manufacturing but find ways to contribute nonetheless. Researchers have always struggled, and always will struggle, to get resources relative to BigCorp - AI is just joining the party. Daimler and Lockheed are no more open than Facebook is, AFAIK. There is still plenty to do and to analyze. Verifiable AI, more efficient models, knowledge transfer, 1000 brains, human interpretability, etc.

  • rococode 4 years ago

    I think the academic side will start shifting towards research on efficiency and speed while companies will continue to push the cutting edge.

    In the NLP space there's been a lot of work recently around reducing model sizes, since they've started to reach the point where model weights sometimes don't fit in the memory of most GPUs.

    There's also projects like MarianNMT which completely abandon Python and write heavily optimized models with fast languages that can run quickly and accurately even without GPUs. I think we'll see a lot more of this, though of course there's a pretty big barrier in the sheer rarity of being good at both deep learning research and writing optimized low-level code.

    • Nasrudith 4 years ago

      It would be a bit ironic for universities to compete on efficiency and speed, given those are two things companies optimize for. Not impossible of course; theory, and encouragement to be a bit more abstract, could lead to providing that.

      As for writing low-level code, I thought that was usually handled by the compiler, or that even the advanced high-performance-for-high-price work mostly tweaked the compiler after analyzing its output. Not my direct space, so I speak with no authority.

    • knodi123 4 years ago

      > I think the academic side will start shifting towards research on efficiency and speed

      Constraints are the mother of creativity.

    • bluenose69 4 years ago

      Julia is not hard for a Python programmer to pick up, and it can be very fast.

  • riazrizvi 4 years ago

    Academia is not necessarily the pinnacle of achievement in a field, it is the pinnacle of published achievement. There is always a dual track of proprietary knowledge, and knowledge available to the commons. Since the latter is most beneficial to society, that’s why we have awards for people when they publish their research, instead of hiding it for maximum profit.

    I don’t see something new here; these institutions that encourage people to share are old, so it must be a problem that has been recognized for a while.

  • paxys 4 years ago

    Even historically, how much cutting-edge research for commercial tech has come from universities? I'd say government-backed labs, military and private corporations have all always had a greater impact.

  • 0x4d464d48 4 years ago

    > But what if at some point they decide to be as "open" as OpenAI (i.e., not)?

    Aside from some of the academics and the "gain and share knowledge for knowledge's sake" types they hire, why would they care?

    For the record, I don't like the idea of scientific research becoming proprietary. At all. But is there anyone credulous enough to think these organizations would willingly risk their bottom line for principles like "openness", rather than just playing PR games to make themselves appear open and concerned?

    In other words "Don't LOOK evil but do evil when no one's looking".

    The Frances Haugen leaks already show how damaging such openness can be.

  • stathibus 4 years ago

    I hope a positive outcome of this will be that universities direct more of their research effort toward efficiency of network architectures and/or understandability.

    • ml_hardware 4 years ago

      Unfortunately it will be hard to investigate properties of large, powerful neural networks without access to their trained weights. And industrial labs that spend millions of dollars training them will not be keen to share.

      If academics want to do research on expensive cutting-edge tech, they will have to join industrial labs or pool together resources, similar to particle physics or drug discovery research today.

karmasimida 4 years ago

> Meta’s AI supercomputer houses 6,080 Nvidia graphics-processing units ..... By mid-summer, when the AI Research SuperCluster is fully built, it will house some 16,000 GPUs

Honestly ... this is a lot of GPUs ... but is it the biggest?

> Model training is done with mixed precision on the NVIDIA DGX SuperPOD-based Selene supercomputer powered by 560 DGX A100 servers networked with HDR InfiniBand in a full fat tree configuration. Each DGX A100 has eight NVIDIA A100 80GB Tensor Core GPUs

So Nvidia used 4480 GPUs to train Megatron-Turing NLG 530B for example.
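
The counts are easy to sanity-check from the quoted passage (just arithmetic; the per-server figure comes from the DGX A100 description above):

```python
# Sanity-check the GPU counts quoted above.
selene_gpus = 560 * 8      # 560 DGX A100 servers x 8 A100s each
print(selene_gpus)         # 4480, the Megatron-Turing NLG training cluster

rsc_now, rsc_full = 6080, 16000  # Meta RSC today vs. the mid-summer target
print(rsc_full / selene_gpus)    # ~3.6x the Selene GPU count
```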

  • vl 4 years ago

    Honestly, this single GPU-based install is child's play compared to Google's multiple TPU exaflop supercomputers with hyper-cube optical interconnects. Google's ML setups allow synchronous weight updates on a thousand-plus TPUs...

    • rawtxapp 4 years ago

      TPUs are amazing, but in my experience, debugging issues with them can be a bit tricky. Since Nvidia's GPUs are more commonplace (especially outside GCP), you can find a lot more information when you get stuck; it's also more battle-tested, etc.

      • 6gvONxR4sf7o 4 years ago

        For what it's worth, JAX is helpful to me here. You can drop out of the jit to debug it as if it were numpy.

        Of course that assumes your issues aren't with the jit itself or inside pmap, etc. That shit's hard.
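
        The escape hatch looks roughly like this (a minimal sketch using `jax.disable_jit`, which runs jitted code eagerly, op by op; the function here is invented for illustration):

```python
import jax
import jax.numpy as jnp

@jax.jit
def loss(w, x):
    # Inside jit, values are abstract tracers, which makes print-debugging hard.
    return jnp.sum((x @ w) ** 2)

w = jnp.ones((3, 2))
x = jnp.ones((4, 3))

# Within this context the function runs eagerly, like numpy, so print()
# and pdb see concrete arrays instead of tracers.
with jax.disable_jit():
    print(loss(w, x))  # 72.0
```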

    • alex_sf 4 years ago

      Tbh I thought I was being trolled with 'hyper-cube optical interconnects'.

      • vl 4 years ago

        Actually, you are right, I mistyped. Hypercube interconnects do exist and were used, for example, in the AS/400, but the system in question uses a hypertorus topology.

    • cm2012 4 years ago

      For what it's worth, for attention-based advertising (YouTube and display, not search), FB targeting blows Google out of the water. Not sure why, but it's consistent across brands.

      • mchusma 4 years ago

        I have seen this myself, but I'm unsure if it's just an "ad quality" thing. For example, I can target exact placements on YouTube for my exact niche, and broad Facebook matching will still outperform. I have tried YouTube and display for months with nothing within an order of magnitude as effective as Facebook.

    • dekhn 4 years ago

      For TPUv3 it's a 2D torus, not a hypercube, right? Not sure if the TPUv4 topology is externally published, but IIRC hypercubes are basically never used any more.

      • vl 4 years ago

        I mistyped, one version is 2D torus, next is 3D torus aka hypertorus.

  • sailingparrot 4 years ago

    At 16k it will definitely be the biggest.

    As for today, Nvidia has the very slightly smaller cluster you outlined at ~5k, Microsoft has a few of roughly that size, and Microsoft also built a 10k-GPU cluster for OpenAI 2 years ago - but those are V100 GPUs.

    So, is 6k A100 "bigger" than 10k V100? It depends exactly how you use them: in a perfect usage scenario yes, slightly. In real life, maybe not.

    • dekhn 4 years ago

      Systems like this are designed to reach nearly peak performance (i.e., # of flops per processing element * # of processing elements), explicitly by building a network that won't block or increase latency for the common expensive operations (allreduce, all-to-all), at the expense of greatly increased cost.

      The point of making this machine is to have a lot of A100s going at the same time, and that will unblock some small set of researchers who are working on time-sensitive competitive research projects by giving them a slight throughput and latency advantage on the largest problems. The vast majority of users would be better served by a small number of cheaper, slower GPUs that they had exclusive access to for the longest time period they could afford to wait.

      • sailingparrot 4 years ago

        > Systems like this are designed to reach nearly peak performance

        The system certainly is. The code running on that system generally isn't: pulling 100% of the FLOPS the GPUs are able to provide is quite hard.

        And my point was it also depends on the specific models you are training. Are you training a transformer model in FP32 precision? Then yes, 6K A100 will beat 10K V100. Are you training a ConvNet in FP16? Then no, 10K V100 will perform better.

        The GPUs have different architectures; you have to use the model architecture best suited for the A100 to achieve the speedup marketed by Nvidia, which is presumably the number FB is using to claim that their 6k GPU cluster is bigger than OpenAI's 10K one.
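
For a paper-spec comparison, using Nvidia's published dense FP16 tensor-core peaks (125 TFLOPS for V100, 312 TFLOPS for A100); these are marketing peaks, and as noted above real training runs sustain far less:

```python
# Back-of-envelope peak throughput from Nvidia's published dense
# FP16 tensor-core figures.
V100_TFLOPS = 125
A100_TFLOPS = 312

openai_v100_pflops = 10_000 * V100_TFLOPS / 1_000
meta_a100_pflops = 6_000 * A100_TFLOPS / 1_000

print(f"10k V100: {openai_v100_pflops:.0f} PFLOPS peak")  # 1250
print(f"6k A100:  {meta_a100_pflops:.0f} PFLOPS peak")    # 1872
```

So on paper the smaller A100 cluster wins; whether it does in practice depends on utilization, as discussed above.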

  • buildbot 4 years ago

    Probably not, as Azure was at 10K last year: https://blogs.microsoft.com/ai/openai-azure-supercomputer/

bearjaws 4 years ago

  “The experiences we’re building for the metaverse require
  enormous compute power…and RSC will enable new AI models
  that can learn from trillions of examples, understand
  hundreds of languages, and more,” Meta CEO Mark Zuckerberg

I don't really understand how AI processing is going to make the 'experiences' any better? This seems to me like investor fluff, saying they have some insane capability that other 'VR providers' don't have...
  • rococode 4 years ago

    I think there are plenty of possibilities:

    - 3d worlds with style transfer on the textures, like maybe there's a cafe with the visual style of Starry Night or something

    - NPCs with conversation models that are finetuned for each NPC's personality and saves some history for each person it talks to for continuity

    - Game-playing AI on NPCs that make them go around doing actual things or playing minigames with players

    - The usual user tracking models, figuring out what people like to do in the metaverse and giving them more of that

    - All the lower-level stuff that AI can do better - user inputs, rendering, etc.

    Whether or not they can pull it off is a separate question - I think the tech is close but not quite there yet - but there's no doubt that the metaverse concept of "an expansive virtual world with lots of fun things to do" has many ways to use huge amounts of computation.

    • Bombthecat 4 years ago

      Loot boxes with special designs just for you, modeled, picked, designed and coloured after your taste.

  • mark_l_watson 4 years ago

    I am not sure either, but I worked on “game AI” over 20 years ago for Nintendo and Disney, and I am 100% sure that I could have used deep learning to good effect if it had been available. In the past seven years, I have been using mostly LSTM and GAN, and recommender models, BTW.

  • forgotmyoldacc 4 years ago

    Example 1: GPT-3 is a decent chatbot. Training a similar model so you can have a conversation with AIs in the "metaverse" (god, that word is terrible) could be fun / useful.

    Example 2: Using AI upscaling (like Nvidia) to improve visual fidelity in games.

    Example 3: Hand/body tracking for avatars.

    The more AI compute, the more experimentation researchers can do.

    • nerdponx 4 years ago

      I guess "AI in the metaverse" sounds a lot better than "machine learning in Meta Inc's new VR platform".

    • d1sxeyes 4 years ago

      One particularly interesting piece of tech I've seen is Nvidia's AI Video 'Compression'.

      In summary, rather than actually streaming video to the person you're chatting with, you send a keyframe, and then 'compressed' video is sent over the wire, and 'decompressed' at the receiver end.

      I'm putting 'compression' in quotes because I'm not sure I'm comfortable calling it compression. Basically, you're remotely controlling an avatar of yourself.

      While the obvious usage of this is reducing bandwidth used (in their example, an h264 stream at ~100KB/frame can be compressed to 0.1KB/frame, literally a thousandth of the bandwidth), it opens up some VERY interesting possibilities for a company like Meta (check from about 1:55 onwards in the video below).

      You can view someone's face from any angle, not just the angle they're speaking from (as you might in a VR world), or you can even map the key points onto a completely different keyframe, allowing for hyper-realistic avatars or next-level virtual backgrounds (imagine: you send a keyframe of you sitting at your desk and hop on a video conference from the beach, and no-one's any the wiser as long as the sea is quiet enough)

      https://developer.nvidia.com/ai-video-compression
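
Working out the quoted per-frame figures at 30 fps (the ~100 KB/frame h264 and ~0.1 KB/frame keypoint numbers are the ones from the comment above):

```python
# Bandwidth at 30 fps, using the per-frame figures quoted above.
h264_kb = 100       # ~KB per h264 frame
keypoints_kb = 0.1  # ~KB per keypoint frame
fps = 30

print(f"h264:      {h264_kb * fps / 1000:.1f} MB/s")  # 3.0 MB/s
print(f"keypoints: {keypoints_kb * fps:.1f} KB/s")    # 3.0 KB/s
```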

      • sbierwagen 4 years ago

        A Fire Upon The Deep (1992):

        >Fleet Central refused the full video link coming from the Out of Band … Kjet had to settle for a combat link: The screen showed a color image with high resolution. Looking at it carefully, one realized the thing was a poor evocation…. Kjet recognized Owner Limmende and Jan Skrits, her chief of staff, but they looked several years out of style: old video matched with the transmitted animation cues. The actual communication channel was less than four thousand bits per second

        >The picture was crisp and clear, but when the figures moved it was with cartoonlike awkwardness. And some of the faces belonged to people Kjet knew had been transferred before the fall of Sjandra Kei. The processors here on the Ølvira were taking the narrowband signal from Fleet Central, fleshing it out with detailed (and out of date) background and evoking the image shown. No more evocations after this, Svensndot promised himself, at least while we’re down here.

  • smoldesu 4 years ago

    I'm hardly an advocate for any of Facebook/Meta's actions over the past... however long, but a lot of people forget that "Metaverse" doesn't just mean "virtual reality". VR could be a large component here, but the biggest goal of the Metaverse is really to map physical things into a digital world. That data can be used in any number of ways, not just VR/AR; it could be used to provide 3D models for common shopping goods in the Walmart app, give meteorologists interactive forecast maps, map GitHub repositories to technology that you use every day... the list goes on. The real goal is ripping digital metadata from real-world objects, which could indeed be inferenced like an AI model for any number of uses.

  • rytill 4 years ago

    For one, real-time avatars make heavy use of “AI processing”. https://research.facebook.com/videos/audio-and-gaze-driven-f...

  • jayd16 4 years ago

    If computer vision falls under AI then it's pretty obvious why it would help with AR and world sensing.

  • some_furry 4 years ago

    Any Virtual Reality where I don't have the option of being a talking blue anthropomorphic dhole (i.e. my fursona) isn't one that I'll ever choose to adopt. Calling it a "metaverse" doesn't affect my decision here.

  • cm2012 4 years ago

    Zuck has vision, especially for what people will want to use. I am looking forward to what FB comes up with here.

    • 52-6F-62 4 years ago

      But his vision, in summary, is to dictate a world view. I am not looking forward to what they come up with.

    • giantrobot 4 years ago

      Do you get paid in MetaBucks or a real currency? What are the hours and benefits like? Does Zuck wave out a window at you in lieu of cash bonuses?

      • cm2012 4 years ago

        I get that it's a very unpopular opinion on HN. A bit of why I post comments like these is to have at least one counterpoint in HN threads that are generally anti Facebook. I am genuinely bullish on the prospects of Facebook as a company based on everything I've seen.

      • dvsfish 4 years ago

        I'd rather these absurd empire businesses at least do something interesting with their empires instead of just making it a bit easier to consolidate money. Zuck is uninspiring and hard to like, but at least this move is somewhat visionary.

  • tikimcfee 4 years ago

    It won’t make it better - it makes it more cost efficient to throw random numbers at a random number optimizer to increase the number of times they can report someone clicked or saw an ad. That’s it, end of story.

    The value add is that the engineering community they employ has a job, the stock stays higher because of their perceived value add to the tech, and the push to control data continues unburdened by something as trivial as a lack of compute power. Hooray. Progress.

  • munchbunny 4 years ago

    In practice it'll probably look like indirect improvements in the infrastructure that devs get for computer vision/natural language and other miscellaneous model training stuff.

    A lot of this stuff is like trickling tech from F1 teams down into consumer cars. Some of the tech will likely end up in commodity datacenter/cloud stuff.

  • macrolocal 4 years ago

    Hm, maybe they're optimizing DLRM models, since inefficient HW communication sometimes bottlenecks Facebook's data center performance [1]. The improvements would be better personalization, i.e. monetization.

    [1] https://arxiv.org/pdf/2104.05158.pdf

  • plafl 4 years ago

    There may be some possibilities. I'm not sure if it counts as AI but nevertheless a nice video:

    https://m.youtube.com/watch?v=BTETsm79D3A

    There is never enough compute power. Dwarf Fortress on a supercomputer?

  • c7DJTLrn 4 years ago

    In the past they've used machine learning to do a kind of blurring I can't remember the name of to make scenes in VR look more realistic. They are building a lot of ML models. I think the future of VR is almost intertwined with machine learning.

  • impulser_ 4 years ago

    Creating a 3D world with limited amount of data.

  • maydup-nem 4 years ago

    > understand hundreds of languages

    > understand

    I know this is CEO-talk, but I sometimes wonder if these pricks really think they are inventing AI.

  • zwaps 4 years ago

    It is not about making experiences better, it's about modeling behavior so as to sell stuff

    • varelse 4 years ago

      Those models are surprisingly tractable. You're nowhere near as interesting and unique as you might think you are at scale.

      Evidence: actual work experience at building latent representations to characterize customer behavior at FAANG. It's hard to come up with something that really gets you, but it's not hard to come up with something likely to make you spend more. You're surprisingly predictable on that axis and even if you aren't because you put the hours into being a crazy outlier, almost everyone else is, and you don't matter.

      • convolvatron 4 years ago

        I wouldn't think it would take putting in any hours to be a crazy outlier - just a markedly different value system.

        • varelse 4 years ago

          Which itself requires hours of listening to alternative influences in order to develop, no? The pressure to conform within 2 sigma is strong in almost any society IMO.

          What people don't seem to grasp is that most of the supposed alternatives to mainstream are pretty mainstream too or we wouldn't have stores like Hot Topic in the first place.

ctoth 4 years ago

Anybody else get the sense that we're just totally frickin doomed? Even if Yudkowsky is off about AGI (which is a big maybe!) in what possible world will this technology be used to make our individual lives better (assuming you're not a FAIR researcher?)

  • hooande 4 years ago

    In theory, this could be used to create AI agents that can process visual/audio information in a way that's more similar to humans. It could lead to household robots or advanced conversational interfaces. Or whatever the hell the metaverse is.

    It's just a bunch of GPUs. It could be used for anything people can imagine, good or bad

    • lm28469 4 years ago

      Go back in time and tell people about the technical power we have in 2022. Then tell them to imagine what we do with it. And finally explain to them that the vast majority of it is used to sell us goods and services we mostly don't need.

    • knodi123 4 years ago

      > It could lead to household robots or advanced conversational interfaces.

      Yes, but anything that could do that, will be used for military robots and context-aware ubiquitous comms surveillance.

      > It's just bunch of GPUs. It could be used for anything people can imagine, good or bad

      And nuclear power can be used for good or ill, too. But when the ills grow big enough, it's still fair to worry about proliferation and possible end-of-civilization events. It's unhelpful to reassure someone building a bomb bunker "Don't worry, nuclear power is just a tool, it can be used for good OR for bad".

    • mwcampbell 4 years ago

      > create AI agents that can process visual/audio information in a way that's more similar to humans

      Which, of course, is great for accessibility.

  • meetups323 4 years ago

    More people trapped inside, on the metaverse = smaller crowds at outdoor recreation areas?

    • emerged 4 years ago

      I think the real world will become populated with android avatars which are controlled by people from their VR headsets.

      So you’ll have small human crowds but loads of anonymous avatar androids taking all the good fishing spots, riding the trails backwards, etc.

      I’m joking hopefully

    • Traubenfuchs 4 years ago

      That would be a win-win for everyone.

      • doublerabbit 4 years ago

        Why?

        • Traubenfuchs 4 years ago

          VR enjoyers can enjoy VR, reality enjoyers can enjoy reality.

          Vienna, where I am living, is completely overrun by people. Public spaces and transport at peak hours have become a mess in the last few years. The first lockdown showed how beautiful the city can be if everyone stays home.

  • Atlas667 4 years ago

    Yep, this will be used for precision tracking and context categorization. And of course, to more deeply understand human motivators and make platforms even more addictive to us. This will have minimal benefits to the working class and all the benefit to Meta and their clients.

    I fear the amount of human information this AI is going to be free to analyze from Facebook, and what it will deduce about us, and then how Meta will use it to generate capital.

  • UncleOxidant 4 years ago

    Yes, I think we're in trouble. Walked over to friends' house over the weekend. It was a very nice, sunny weekend. Saw them through the window with their goggles on, rang the doorbell. They came to the door and I said, look how nice and sunny it is, we should go for a walk, and they were like "oh, we had no idea it was so nice out because we were in Meta with our goggles on".

    • colinmhayes 4 years ago

      I don't see how this is a bad thing. What's wrong with preferring the metaverse to their sunny neighborhood? Honestly the matrix seems like the end game here and I'm all for it. Then it can be a sunny day in the neighborhood every day. The doom comes from the matrix being built by people who either don't have our best interests in mind, or aren't competent enough to build the system to correctly act in our best interests.

      • yumraj 4 years ago

        > The doom comes from the matrix being built by people who either don't have our best interests in mind ...

        You mean like Zuckerberg and FB?

      • dominotw 4 years ago

        vitamin d deficiency

  • LightG 4 years ago

    Probably. I'm going to set up all my data so that, anytime it's parsed by Facebook, there'll be a middle-finger waiting for their supercomputer with just two words: "Process this".

  • cblconfederate 4 years ago

    Biology is hard, perhaps too complex for human brains to make a decent model of. AI can solve it and try to explain it to us (if it pleases her)

  • hourislate 4 years ago

    We're not all doomed, just the NPCs who embrace it.

pohl 4 years ago

"You're going to be eaten by a bronteroc. We don't know what it means."

mawadev 4 years ago

If this won't make people watch ads 24/7, then what will?

exdsq 4 years ago

How far can we actually take current machine learning technologies by scaling the underlying hardware? Are we going to see AI algorithms that are 20% better, or an order of magnitude better? And what will that realistically look like to an end user? This will have cost a lot of money, and maybe the news alone will push stock prices and mean it's paid for itself, but is it actually going to result in a substantially better product?

  • jazzyjackson 4 years ago

    I was just in a Twitter Spaces room and they have a live transcription feature, so as to be accessible and all, except the transcript was gibberish. If Facebook wants live translation in the Metaverse, they should hope this brings orders of magnitudes improvement to voice recognition, especially in languages other than english (by far the largest training set available)

    • zydex 4 years ago

      I obviously don't know the parameters of the room you're referencing, but is it possible that the majority of the issue is on the side of poor user audio and a large number of simultaneous speakers? I find YouTube's transcription to be quite impressive with a handful of speakers and moderate audio quality.

  • ml_hardware 4 years ago

    You may find this blog post useful for thinking about AI scaling: https://www.alignmentforum.org/posts/k2SNji3jXaLGhBeYP/extra...

    For general tasks like language modeling, we are still seeing predictable improvements (on the next-token-prediction loss) with increasing compute. We will very likely be able to scale things up by 10,000x or so and continue to see increasing performance.

    But what does this mean for end users? We are probably going to see sigmoid-like curves, where qualitative features of these models (like being able to do math, or tell jokes, or tutor you in French, or provide therapy, or mediate international conflicts) will suddenly get a * lot * better at some point in the scaling curve. We saw this for simple arithmetic in the GPT-3 paper, where the small <1B param models were terrible at it, and then with 100B scale suddenly the model could do arithmetic with 80%+ accuracy.

    Personally I would not expect diminishing returns with increased scale, instead there will be sudden leaps in ability that will be very economically valuable. And that is why Meta and others are so interested in scaling up these models.

  • arnaudsm 4 years ago

    It's linear for now (check GPT-2 vs GPT-3), but we're close to the point of diminishing returns.

    • bcaine 4 years ago

      It's actually not linear, it's a power law. That means we need exponentially more compute, data, and model parameters to see linear improvements in performance.
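
To make that concrete, a toy calculation. The exponent below is illustrative (in the rough ballpark of published language-model scaling fits), but the qualitative point holds for any small exponent:

```python
# Toy power-law scaling: loss ~ a * C**(-alpha) in compute C.
# With a small alpha, a fixed fractional loss reduction costs a
# multiplicative jump in compute.
alpha = 0.05  # illustrative exponent

# Solve (C2/C1)**(-alpha) = 0.9 for the compute ratio that cuts
# loss by 10%: C2/C1 = 0.9**(-1/alpha)
multiplier = 0.9 ** (-1 / alpha)
print(f"~{multiplier:.1f}x more compute per 10% loss reduction")  # ~8.2x
```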

    • mindcrime 4 years ago

      Part of the problem though, is that we don't know for sure what non-linearities may be lurking out there. Maybe we add 100 more "neurons" to the net and it "goes exponential" so to speak. Or maybe not. There's still a lot we don't know about the emergent properties of these systems as they scale up.

  • Buttons840 4 years ago

    I think these things scale sub-linearly

colechristensen 4 years ago

What is the difference between “a supercomputer” and “a bunch of racks of computers”?

The actual difference between the two has shrunk considerably compared to years past, and now comes down more to how a collection of computers is used than to what it is.

  • tyingq 4 years ago

    The big remaining one appears to be an unusually high speed interconnect. Infiniband, etc.

    • lmeyerov 4 years ago

      Yep, hetero multigpu fleet mixing high ram GPUs (40-80GB each on each A100) as multigpus w smaller (ex: ~12-16 GB T4s) nodes, w crazy interconnects locally (nvlink) and across nodes. And storage gets fun as well, like parallel SSD arrays for 100GB+/s combined per node. Then whatever legacy+hybrid CPU stuff. Ex: for stuff like PCIe, new generations that ~10x the bandwidth you'd see in a gamer box, and like 1-2 per GPU. Varies a lot for say log mining vs NN training, and even for diff NNs. Ex: Graph NNs end up needing more balanced CPU side.

      Saturating a box with 500+ GB GPU RAM is fun. Only our gov users ask us for help on that typically: most of our users are commercial nowadays, but with much smaller/scaled down GPU rigs. I think that'll change as the fintechs keep improving and software gets easier, but they are still not there (outside of niches). Working on it :)

      (If you like writing shaders, we are hiring :D )

  • benstrumental 4 years ago

    > What is the difference between “a supercomputer” and “a bunch of racks of computers”?

    In addition to the other responses, I like pointing people to this talk[1] by Jeff Hammond for a comprehensive answer to this question (you can skip to the 11:15 timestamp).

    [1] https://uchicago.hosted.panopto.com/Panopto/Pages/Embed.aspx...

    • paxys 4 years ago

      That talk is from 2009 though. Nowadays companies regularly run jobs on commercial data centers which can include thousands of GPU cores, Infiniband networking and other specialized equipment. One can make a pretty valid case that we are approaching the ability to make an ad-hoc supercomputer for yourself from the GCP console.

  • fennecfoxen 4 years ago

    It's all about the distributed filesystems made from big arrays of very fast disks, and the massive I/O backplane to the storage system and between nodes.

  • KaiserPro 4 years ago

    This is a shared memory cluster. That is, there is some level of RDMA over a networking fabric.

  • cjbgkagh 4 years ago

    I’d say mainly networking bandwidth.

pinewurst 4 years ago

Are we supposed to collectively feel, "Yay, Facebook!"?! It's a bigger tool for surveillance enablement, no different from how the CCP monitors Xinjiang cameras.

Atlas667 4 years ago

An AI that has free range to more profoundly study all the human data Facebook users generate?

That sounds wicked evil. If ads, marketing and habit inducing platform designs are a problem now, imagine what this will lead to.

To understand what drives your users more than the users understand it themselves and to use that understanding for profit. Intensified.

And not to mention for surveillance, you know DARPA and the NSA want their hands all over this.

rezonant 4 years ago

Alternative headline: "Facebook patents Skynet"

yosito 4 years ago

How can they possibly keep the location of something like this a secret? There have to be thousands of people involved in building and maintaining it.

  • paxys 4 years ago

    Just because they aren't publishing the location in a blog post doesn't mean they are keeping it secret. It's simply not relevant info.

  • riffic 4 years ago

    I don't see any sort of commitment to secrecy with this:

    > The company declined to comment on the location of the facility or the cost

    It's generally common practice not to disclose the addresses of your data centers, but they can usually be discerned with a bit of research. Journos aren't going to dig that deep.

  • jreese 4 years ago

    Meta has a number of publicly-announced datacenter locations that were built and operated specifically by/for Meta. It's probably safe to assume it's located in one or more of those datacenters.

  • changoplatanero 4 years ago

    won't it just be a few racks of gpus in one of the existing giant data centers?

    • ssully 4 years ago

      Supercomputers are much larger than a few racks of GPUs.

      • ceejayoz 4 years ago

        You could fit any of the TOP500 machines into one of Facebook's datacenters, couldn't you? With room left over to spare?

        It's not like Facebook had to go hollow out the Moon to make this.

        • jeffbee 4 years ago

          The largest machine in the top500 draws 30MW, which is getting to be close to the size of a Facebook or Google datacenter. All the rest are much smaller. Mostly people misunderstand the relationship between supercomputers and the cloud. Supercomputers are somewhat large and very specialized. Cloud datacenters are just enormous.

      • riantogo 4 years ago

        Many years back the military connected some 1,700 PS3s to create the world's 35th most powerful supercomputer. That needed a few racks and could do 500 tflops. One latest Xbox can do 12 tflops sitting in your living room. Of course, supercomputers have also gotten orders of magnitude faster since. But hope this gives some sense of physical size.
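
Working those figures through (single-precision peaks, not directly comparable across hardware generations, but fine for a sense of scale):

```python
# Rough sanity check of the numbers above.
cluster_tflops = 500  # the ~1,700-PS3 cluster's reported peak
ps3_count = 1700
xbox_tflops = 12      # a current console's GPU peak

per_ps3 = cluster_tflops / ps3_count
print(f"~{per_ps3:.2f} TFLOPS per PS3")                       # ~0.29
print(f"1 modern console ~= {xbox_tflops / per_ps3:.0f} PS3s")  # ~41
```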

bno1 4 years ago

I wonder if things like this are the real reason behind the GPU shortage. How many other AI super computers are being built right now?

  • terafo 4 years ago

    This is definitely not the case. A100, which is used for most "AI supercomputers" is manufactured on TSMC fabs, while Nvidia's gaming cards are produced on Samsung fabs. AMD produces their gaming GPUs on TSMC, but they are somewhere around 10% of the market since they are unwilling to divert their capacity from CPUs, which are more profitable, and consoles, really not sure why.

  • exdsq 4 years ago

    I think these sorts of computers use special GPUs that are industrial and used specifically for AI/ML work. I don't believe they've powered the super computer with 3080s and I also don't think they use the same underlying chips either (albeit they are probably built with the same raw material that might be in short supply).

    • colechristensen 4 years ago

      They take up chip fab capacity and that’s the bottleneck. The fact that it would be a custom die doesn’t really make a difference (and high level the features that go on the chip aren’t really all that different either, same stuff with various quantities and features tweaked)

      • terafo 4 years ago

        They take up capacity at a different fab. You can't produce gaming Ampere on TSMC. They are built on different architectures that have only the name in common. The difference between "Ampere" and "Ampere" is bigger than between Volta and Turing, or maybe even bigger than the difference between Pascal and Turing.

    • capableweb 4 years ago

      Good luck building special GPUs in just two years, especially with what's happening regarding chip production right now. Not sure how they could have achieved a project of this size/scope unless they use off-the-shelf components, since the backlogs are so long and have been for some time now.

      • exdsq 4 years ago

        Facebook has been hiring FPGA engineers with ML experience since 2018 so I don't think this would be out of the question! But even so, Nvidia sell custom GPUs that aren't the same ones for gaming.

        • capableweb 4 years ago

          They claim the work was done in just two years ("The supercomputer, the AI Research SuperCluster, was the result of nearly two years of work"). I'm not an expert in manufacturing, but I'd imagine it takes longer than two years to design > test > manufacture completely new chips.

          > But even so, Nvidia sell custom GPUs that aren't the same ones for gaming.

          Interesting. Gonna be fun to observe the outrage (from a distance) about how GPUs are not only used to destroy the environment for cryptocurrency profits, but now Facebook will also contribute to the world's destruction for ad money.

  • snek_case 4 years ago

    Yes, lots of GPUs have been purchased to be installed in compute clusters since ~2009. The deep learning boom only increased that. At least these are not being used to mine cryptocoins...

  • redisman 4 years ago

    You can probably look up who is TSMC manufacturing custom chips for since they’re public

  • KaiserPro 4 years ago

    naa, they have at most ~30k GPUs. Most of it is lack of capacity rather than large demand.

keithnz 4 years ago

So a company that specializes in targeted advertising to make money has invested in the most powerful AI supercomputer? Great.

michelb 4 years ago

All this to better predict behavior and present ads. What a waste.

  • sxv 4 years ago

    A waste is when you throw something into a landfill. This is weaponization by an enemy of the people.

ricardobeat 4 years ago

https://archive.is/xdQtE

hetspookjee 4 years ago

So how many Wh will one inference cost? I mean, computing has become more efficient, but I still struggle to find data on how many Wh are used for some sentences of GPT-3, for example.

lemax 4 years ago

It feels eerie that these trillions of parameters and exabyte-sized training sets will come from harvested user data. 17 years of user activity all culminating in some... supercomputer? I wonder how comments I wrote when I, and all the other people from my generation, were like 14 and using FB will feed into this and sort of be immortalized in this strange way.

bognition 4 years ago

https://archive.md/xdQtE

sydthrowaway 4 years ago

I feel like the guy in A Canticle For Leibowitz who knew the history of the world.

perilousacts 4 years ago

Literally don't care. Facebook should not be the company with this. :/

adamnemecek 4 years ago

Wow, I hope that the surveillance state will be at least 30% more efficient.

  • bee_rider 4 years ago

    The cool thing about improving efficiency is that you can either keep doing what you were doing, but 30% cheaper, or you can just do 30% more of it!

    The best thing is, assuming the 'quality' of their product scales with the amount of work put into it, we'll get... 30% more accurate ads? Somehow they'll steal 30% of Google's lunch? Well, I don't know, but it sure looks like an incredible amount of engineering talent has been put toward getting us 30% more nothing.

    • eezurr 4 years ago

      I think you're not considering the effective efficiency difference. This is what scares me about unreviewed (by society, government) advances in technology.

      If we increase the efficiency of something (lets say software) by 100%, all the good things that can be done with software gain a 100% efficiency. However, that does not equate to all the bad things that can be done with software gain a 100% efficiency. Many destructive actions are orders of magnitude more efficient than all things constructive (currently), so the net result is that the world gets more dangerous.

      For a more physical example, consider that a truck filled with powerful explosives could knock down a sky scraper. That is, for a handful of manhours, it is possible to undo the work of hundreds of thousands of manhours, plus the hundreds of thousands of manhours society would need to divert to managing the after effects of that disaster, and the emotional cost, etc.

      There's an underlying efficiency bonus that destructive actions have that is not being accounted for.

  • clows 4 years ago

    or at least ads will be 2% less irrelevant.

    • tikimcfee 4 years ago

      Nope, you'll just get 10x as many ads with half the duration, to minimize the amount of time your brain has to determine if something is relevant or not. Those 5 second ads don't cut short because they're kind - it's all they need to repeat to have the name, jingle, or sad-face burned into your neural net.

      • Permit 4 years ago

        Can you elaborate? Since Facebook has built a large supercomputer, we should all expect to see more ads? I don't understand why the quantity of ads would increase...

    • Traubenfuchs 4 years ago

      Anecdote time: For the first time, my new partner spent last week at my home, using my wifi. He is a car nerd. I am now receiving car ads that are absolutely not relevant to me.

      Adtech is still a bad joke.

busymom0 4 years ago

Could this have been the reason for chip shortage?

penjelly 4 years ago

it feels like the future of companies is to increasingly give tasks to AI. So eventually we'll have massive corps that have a couple execs and a super ai making them all obscenely rich?

I want to hate this idea, but it would be the same as hating machines replacing manual labor over the last 100 years.

I'm not sure what to think, nor how to prepare myself for the next 20 years.

  • paxys 4 years ago

    20 years is optimistic. This future isn't something we'll have to worry about in our lifetime, if ever. People wildly overestimate the state of AI as it exists today.

  • edgyquant 4 years ago

    Not likely. Automation only increases productivity, and companies are always looking to expand. The only thing automating jobs does is create new ones for people to work.
