OpenAI's Long-Term AI Risk Team Has Disbanded

100 points by robbiet480 2 years ago · 58 comments

Reader

mgdev 2 years ago

Yes, it's valuable to have a small research team who focuses on R&D outside the production loop.

But when you give them a larger remit, and structure teams with some owning "value" and and others essentially owning "risk", the risk teams tend to attract navel-gazers and/or coasters. They wield their authority like a whip without regard for business value.

The problem is the incentives tend to be totally misaligned. Instead the team that ships the "value" also needs to own their own risk management - metrics and counter metrics - with management holding them accountable for striking the balance.

Hasu 2 years ago

The purpose of the risk team at OpenAI was to prevent the destruction of humanity.
I think you definitely want people who have that responsibility to "wield their authority like a whip without regard for business value".
Now, whether you buy OpenAI's hype about the potential danger (and value) of their products, that's up to you, but when the company says, "We're getting rid of the team that makes sure we don't kill everyone", there is a message being sent. Whether it's "We don't really think our technology is that dangerous (and therefore valuable)" or "We don't really care if we accidentally kill everyone", it's not a good message.
- squarefoot 2 years ago
  
  > but when the company says, "We're getting rid of the team that makes sure we don't kill everyone", there is a message being sent
  Hard not to imagine a pattern if one considers what they did a few months ago:
  https://www.cnbc.com/2024/01/16/openai-quietly-removes-ban-o...
  - mnky9800n 2 years ago
    
    Maybe the message is, these ai ain't going to take over the world.
- mgdev 2 years ago
  
  > The purpose of the risk team at OpenAI was to prevent the destruction of humanity.
  Yeah, the problem (in this outsider's opinion) is that that charter is so ill-defined that it's practically useless, which in turn means that any sufficiently loud voice can apply it to anything. It's practically begging to be used as a whip.
  > I think you definitely want people who have that responsibility to "wield their authority like a whip without regard for business value".
  No, because without thinking of value, there is no enterprise, and then your mission is impossible. Essentially, it creates an incentive where the best outcome is to destroy the company. And, hey hey, that's kinda what almost happened.
  > Whether it's "We don't really think our technology is that dangerous (and therefore valuable)" or "We don't really care if we accidentally kill everyone", it's not a good message.
  I don't think it has to be so black-and-white as this. Meta, Microsoft, and Google did the same thing. Instead, those functions have been integrated more closely into the value teams. And I can't imagine Amazon or Oracle ever having those teams in the first place. They likely all realized the same thing: those teams add huge drag without adding measurable business value.
  And yes, there are ways to measure the business value of risk management, and weigh against upside value to decide the correct course of actions - it's just that most of those teams in big tech don't actually take a formal "risk management" approach. Instead they pontificate or copy and enforce.
davidivadavid 2 years ago

So what you're saying is OpenAI can't align two teams internally but they want to align a super intelligence.
- mgdev 2 years ago
  
  I think "aligning super intelligence" is a nothingburger of a goal, for exactly that reason. It's not a problem that's unique to OpenAI.
  The reason you can't "align" AI is because we, as humans on the planet, aren't universally aligned on what "aligned" means.
  At best you can align to a particular group of people (a company, a town, a state, a country). But "global alignment" in almost any context just devolves into war or authoritarianism (virtual or actual).
encoderer 2 years ago

Here’s the former team leaders take. Says they couldn’t get compute resources to do their job.
https://x.com/janleike/status/1791498174659715494
- petre 2 years ago
  
  Well, Altman did try to get potential investors to shell out $7T for "AI chips".
6gvONxR4sf7o 2 years ago

Speaking of incentives, the way your suggestion tends to work out is that you make a team own both value and risk, then incentivize them only for value, which works out predictably.
23B1 2 years ago

No, this just puts the fox in the henhouse. Systems of checks-and-balances between independent entities exist for a reason.
Without them internally, it'll just fall to regulators, which of course is what shareholders want; to privatize upside and socialize downside.
- mgdev 2 years ago
  
  Agree that you need checks and balances, but there are better and worse systems.
  As someone who has scaled orgs from tens to thousands of engineers, I can tell you: you need value teams to own their own risk.
  A small, central R&D team may work with management to set the bar, but they can't be responsible for mitigating the risk on the ground - and they shouldn't be led to believe that that is their job. It never works, and creates bad team dynamics. Either the central team goes too far, or they feel ignored. (See: security, compliance.)
germinator 2 years ago

In the general case, I mostly agree, but it cracks me up that this is the prevailing attitude when it comes to our industry; but when we see police departments or government agencies trying to follow the same playbook, we immediately point out how that's laughable and doesn't result in real accountability.
In this specific case, though, Sam Altman's narrative is that they created an existential risk to humanity and that the access to it needs to be restricted for others. So which is it?
- nprateem 2 years ago
  
  Neither. It was pure hype to get free column inches.
  Anyone who's used their AI and discovered how it ignores instructions and makes things up isn't going to honestly believe it poses an existential threat any time soon. Now they're big enough they can end that charade.
  - saulpw 2 years ago
    
    Ignoring instructions and making things up may very well be an existential threat. Just not of the SkyNet variety.
    
    nprateem 2 years ago
    
    Of course, but stupidity and lying always potentially is. It doesn't need to be technical. That's a whole different conversation.
seanhunter 2 years ago

Completely agree with this. Everyone doing AI has to do AI that is both valuable and responsible. You can't have an org structure where the valuable AI and responsible AI teams are in some kind of war against each other. It will just never work well.[1]
Imagine if vehicle manufacturers[2] split their design and R&D teams into a "make the thing go" team and a "don't kill the passengers" team. Literally noone would think that arrangement made sense.
I can totally see when we are at a state of significant maturity of both AI and AI regulation that you have a special part of your legal team that are specialised in legal/regulatory compliance issues around AI just like companies tend to have specialised data privacy compliance experts. But we're not there yet.
[1] If you're serious about long-term AI risk and alignment research, sponsor some independent academic research that gets published. That way it's arms-length and genuinely credible.
[2] If you like you can maybe mentally exclude Boeing in this.
- doktrin 2 years ago
  
  1. Audit / evaluation / quality assurance teams exist across multiple verticals from multinationals to government, and cannot reliably function when overly subservient to the production or “value creating” side
  2. Boeing is a good and timely example of the consequences of said internal checks and balances collapsing under “value creation” pressure. That was a catastrophic failure which still can’t reasonably be compared to the downside of misaligned AI.
  - seanhunter 2 years ago
    
    I agree with you on both points, but they have QA which is 1. The long-term risk team was more of a research/futurology/navel-gazing entity rather than a qa/audit function. I would say if you have any possible safety/alignment test that you can feasibly run it should be part of the CI/CD pipline and be run during training also. That's not what that group was doing.
    
    doktrin 2 years ago
    
    That’s quite the narrow goalpost you’ve set up. What happens if a problem can’t be expressed as a Jenkins pipeline operation?
- mgdev 2 years ago
  
  > Imagine if vehicle manufacturers[2] split their design and R&D teams into a "make the thing go" team and a "don't kill the passengers" team.
  Rocket caskets. Can't kill someone who is already dead!
reducesuffering 2 years ago

Did you read Jan Leike's resignation? https://x.com/janleike/status/1791498174659715494
I hope others see that there are two extremely intelligent sides, but one has mega $$ to earn and the other is pleading that there are dangers ahead and not to follow the money and fame.
This is climate change and oil companies all over again, and just like then and now, oil companies are winning.
Fundamentally, many people are the first stage, denial. Staring down our current trajectory of AGI is one of the darkest realities to imagine and that is not pleasant to grapple with.
- swatcoder 2 years ago
  
  The alternate take is that the ceiling is proving to be lower than originally hoped, and that one team of startup folk are looking to squeeze out a business model that satisfies their original investors and the other team of passionate researchers are ready to go find their next bold research opportunities elsewhere.
  - reducesuffering 2 years ago
    
    > passionate researchers are ready to go find their next bold research opportunities elsewhere
    Not supported by Ilya agreeing with board to fire Sam Altman.
    I also think you'll struggle to find a majority of people thinking AI research's "ceiling is proving to be lower than originally hoped", what with 4o, SORA, GPT5 all coming and it's only been 1.5 years since ChatGPT.
  - mgdev 2 years ago
    
    I like this take best. :)
    Most folks who are part of the startup, don't care to stick around for the scale. Different phases of the company tend to attract different types of people.
- mgdev 2 years ago
  
  I hadn't until now, but it's the other failure mode I mentioned in another fork [1] of this thread:
  > A small, central R&D team may work with management to set the bar, but they can't be responsible for mitigating the risk on the ground - and they shouldn't be led to believe that that is their job. It never works, and creates bad team dynamics. Either the central team goes too far, or they feel ignored. (See: security, compliance.)
  [1]: https://news.ycombinator.com/item?id=40391283

23B1 2 years ago

Think of it like the industrial revolution. No environmentalist, shouting for analysis, regulation, or transparency would have survived that era, they'd've been steamrolled. Now we're left with many long-term problems, even generations downstream of that focus on profit above all else.

Same thing happening now.

And you don't have to be a doomer screeching about skynet. The web is already piling up with pollutive, procedurally-generated smog.

I'm not catastrophizing; its just that history is the best predictor of the future.

jaggs 2 years ago

At a guess I would say there are many competing imperatives for OpenAI

1. Stay just a tiny bit ahead of rivals. It's clear that OpenAI have much much more in the bag than the stuff they're showing. I'm guessing that DARPA/Washington has got them on a pretty tight leash.

2. Drip feed advances to avoid freaking people out. Again while not allowing rivals to upstage them.

3. Try to build a business without hobbling it with ethical considerations (ethics generally don't work well alongside rampant profit goals)

4. Look for opportunities to dominate, before the moat is seriously threatened by open source options like Llama. Meta has already suggested that in 2 months they'll be close to an open source alternative to GPT4o.

5. Hope that whatever alignment structures they've installed hold in place under public stress.

Horrible place to be as a pioneer in a sector which is moving at the speed of light.

We're on a runaway Moloch train, just gotta hang on!

blovescoffee 2 years ago

What could DARPA/Washington do and why would “they” know what goes on behind the scenes at OpenAI?
- jaggs 2 years ago
  
  Sam Altman already mentioned having to check with Washington regarding release processes. It's one of his tweets. DARPA is defense as you know. It's pretty obvious that the military (and governments) will be heavily engaged in AI development at all levels.

lordmauve 2 years ago

Good. As someone who is a paid up OpenAI user I absolutely don't agree that there should be a role for a team screaming to put the brakes on because of some nebulous, imagined "existential risk" of hypothetical future AGI.

There are huge risks to AI today in terms of upheaval to economies and harms to individuals and minorities but they need to be tackled by carefully designed legislation, focused on real harms, like the EU AI legislation.

Then that imposes very specific obligations that every AI product must meet.

It's both better targeted, has wider impact across the industry, and probably allows moving faster in terms of tech.

croes 2 years ago

Bad As someone who is a paid up OpenAI user I absolutely agree that there should be a role to put the brakes on because some value profit over risks.
Two years ago, you wouldn't have believed it if someone had promised results like we have now. AGI can appear suddenly, or even decades later. But if nobody pays attention to it, we will definitely notice it too late if it happens.

39896880 2 years ago

It appears that sama and co said whatever they needed to say to investors to convince them they cared about the actual future, so now it’s time to switch to quarterly profit goals. That was fast. Next up:

* Stealth ads in the model output

* Sale of user data to databrokers

* Injection into otherwise useful apps to juice the usage numbers.

joaogui1 2 years ago

The ads are definitely coming given their pitch deck for the data partnerships https://www.adweek.com/media/openai-preferred-publisher-prog...

_wire_ 2 years ago

Ridiculous. The board can't even regulate itself in the immediate moment, so who cares if they're not trying to regulate "long term risk". The article is trafficking in nonsense.

"The company’s trajectory has been nothing short of miraculous, and I’m confident that OpenAI will build AGI..."

More nonsense.

"...that's safe and beneficial."

Go on...

"Two researchers on the team, Leopold Aschenbrenner and Pavel Izmailov, were dismissed for leaking company secrets..."

The firm is obviously out of control according to first principles, so any claim of responsibility in context is moot.

When management are openly this screwed up in their internal governance, there's no reason to believe anything else they say about their intentions. The disbanding of the "superalignment" team is a simple public admission the firm has no idea what they are doing.

As to the hype-mongering of the article, replace the string "AGI" everywhere it appears with "sentient-nuclear-bomb": how would you feel about this article?

You might want to see the bomb!

But all you'll find is a chatbot.

—

Bomb#20: You are false data.

Sgt. Pinback: Hmmm?

Bomb#20: Therefore I shall ignore you.

Sgt. Pinback: Hello... bomb?

Bomb#20: False data can act only as a distraction. Therefore, I shall refuse to perceive.

Sgt. Pinback: Hey, bomb?

Bomb#20: The only thing that exists is myself.

Sgt. Pinback: Snap out of it, bomb.

Bomb#20: In the beginning, there was darkness. And the darkness was without form, and void.

Boiler: What the hell is he talking about?

Bomb#20: And in addition to the darkness there was also me. And I moved upon the face of the darkness.

tim333 2 years ago

Dunno about the "will build AGI" bit being nonsense. Ilya knows more about this stuff than most people.
- _wire_ 2 years ago
  
  // COMPUTING MACHINERY AND INTELLIGENCE By A. M. Turing 1. The Imitation Game I propose to consider the question, "Can machines think?" This should begin with definitions of the meaning of the terms "machine" and "think." The definitions might be framed so as to reflect so far as possible the normal use of the words, but this attitude is dangerous, If the meaning of the words "machine" and "think" are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, "Can machines think?" is to be sought in a statistical survey such as a Gallup poll. //
  AFAIK no consensus on what it means to think has developed past Turing's above point, and the "Imitation Game," a.k.a "Turing Test," which was Turing's throwing up his hands at the idea of thinking machines, is today's de facto standard for machine intelligence.
  IOW a machine thinks if you think it does.
  And by this definition the Turing Test test was passed by Weizenbaum's "Eliza" chatbot in the mid 60s.
  Modern chatbots have been refined a lot since, and can accommodate far more sophisticated forms of interrogation, but their limits are still overwhelming if not obvious to the uninitiated.
  A crucial next measure of an AGI must be attended by the realization that it's unethical to delete it, or maybe even reset it, or turn it off. We are completely unprepared for such an eventuality, so recourse to pragmatism will demand that no transformer technology can be defined as intelligent in any human sense. It will always regarded as a simulation or robot.
  - tim333 2 years ago
    
    It can be tricky to discuss AGI in any precise way because everyone seems to have their own definition of it and ideas about it. I mean ChatGPT is already intelligent in that if can do university exam type questions better than the average human and general in that it can have a go at most things. And we still seem fine about turning it off - I think you are overestimating the ethics of a species that exploits or eats most other species.
    For me the interesting test of computer intelligence would be if it can replace us in the sense that at the moment if all humans disappeared ChatGPT and the like would stop because there would be no electricity but at some point maybe intelligent robots will be able to do that stuff and go on without us. That's kind of what I think of as AGI rather than the Turing stuff. I guess you could call it the computers don't need us point. I'm not sure how far in the future it is. A decade or two?
    
    _wire_ 2 years ago
    
    > It can be tricky to discuss AGI in any precise way because everyone seems to have their own definition of it and ideas about it.
    You have just echoed Turing from his seminal paper without adding any further observation.
    > ...you are overestimating the ethics of a species that exploits or eats most other species...
    Life would not exist without consumption of other life. Ethicists are untroubled by this. But they are troubled by conduct towards people and animals.
    If the conventional measure of an artificial intelligence is in terms of a general inability to tell computers and humans apart, then ethics enters the field at such point: Once you can't tell, ethically you are required to extend the same protections to the AI construct as offered to a person.
    To clarify my previous point: pragmatic orientation towards AI technology will enforce a distinction through definition: the Turing test will become re-interpreted as the measure by which machines are reliably distinguished from people, not a measure of whence they surpass people.
    To rephrase the central point of Turing's thought experiment: the question of whether machines think is meaningless so as to not merit further discussion because we lack sufficient formal definition of "machine" and "thought."
    > ...the computers don't need us point...
    I see no reason to expect this at any point, ever. Whatever you are implying as "computers" and "us" with your conjecture of "need" is so detached from today's notions of life is also meaningless. Putting a timeframe on the meaningless is pointless.
    > ...go on without us...
    This is a loopy conjecture about a new form of life which emerges-from-and-transcends humanity, presumably to the ultimate point of obviating humanity. Ok, so "imagine a world without humanity." Sure, who's doing the imagining? It's absurd.
    Turing's point was we lack the vocabulary to discuss these matters, so he offered an approximation with a overtly stated expectation that by about this time (50 or so years from the time of his paper) technology for simulating thought would be sufficiently advanced as to demand a new vocabulary. And here we are.
    If you contribution is merely recapitulation of Turing's precept's from decades ago, you're a bit late to the imitation game.

sklargh 2 years ago

I believe the right analytical lens for this situation is - “You come at the king, you best not miss.”

Omar, portrayed by Michael K. Williams written by Ed Burns and David Chase.

dole 2 years ago

"When you shoot at the king and miss, things tend to get awkward."
https://www.vox.com/future-perfect/2024/5/17/24158403/openai...
evanelias 2 years ago

> Omar, portrayed by Michael K. Williams written by Ed Burns and David Chase.
I believe you mean David Simon, not David Chase.

jvanderbot 2 years ago

Honestly, having a "Long term AI risk" team is a great idea for an early stage startup claiming to build General AI. It looks like they are taking the mission and risks seriously.

But for a product-focused LLM shop trying to infuse into everything, it makes sense to tone down the hype.

nprateem 2 years ago

It makes it look like the tech is so rad it's dangerous. Total bollocks, but great marketing.
- reducesuffering 2 years ago
  
  Ilya and Jan Leike[0] resigned (were fired) because they believed their jobs were a temporary marketing expense? Or maybe you think you understand the risks of AGI better than them, the creators of the frontier models?
  Do you think this is a coherent world view? Compared to the other one staring you in the face? I'll leave it to the reader whether they want to believe this conspiratorial take in line with profit-motive instead of the scientists saying:
  “Currently, we don't have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue.”
  [0] https://scholar.google.co.uk/citations?user=beiWcokAAAAJ&hl=...
  - jvanderbot 2 years ago
    
    They resigned (or were fired) because the business no longer needs their unit, which puts a damper on their impact and usefulness. It also makes them a cost center in a business that is striving to become profitable.
    That is the simplest explanation, it's a tale as old as time. And is fundamentally explained by a very plausible pivot from "World changing general purpose AI - believe me it's real" to "world changing LLM integration and innovation shop".
  - tim333 2 years ago
    
    >“Currently, we don't have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue.”
    We could always stop paying for the servers, or their electricity.
    I think we'll have AGI soon but it won't be that much threat to the world.
    
    reducesuffering 2 years ago
    
    > We could always stop paying for the servers, or their electricity.
    This is satire, right? No one saying this or "off button" has thought this difficult problem through longer than 30 minutes.
    https://youtu.be/_8q9bjNHeSo?si=a7PAHtiuDIAL2uQD&t=4817
    "Can we just turn it off?"
    "It has thought of that. It will not give you a sign that makes you want to turn it off before it is too late to do that."
  - nprateem 2 years ago
    
    People can believe whatever they like. It doesn't make them right.
    The flaw is in your quote: there is no "super-intelligent AI". We don't have AGI, and given they were coming out with this a few years ago (GPT2?) it's laughable.
    They're getting way ahead of themselves.
    
    reducesuffering 2 years ago
    
    "We don't have AGI"
    We don't have 2 degrees Celsius warming either. Should we do nothing to change course or prepare? Any thinker worth their salt knows you need to plan ahead not react to things as they come and leave to chance that you then may not be able to.
    
    nprateem 2 years ago
    
    Exactly. Which is why this shows they don't genuinely believe their own hype and fear-mongering. This says more than anything that people at the cutting edge don't really think AGI is on the horizon.
    
    SpicyLemonZest 2 years ago
    
    They're heavily incentivized not to! Exxon executives in the 80s disbelieved in climate change, too, despite reports from their internal "safety" teams that it was going to be a big problem.
    
    nprateem 2 years ago
    
    The company that develops real AGI will become a trillion dollar company over night (and destroy all the competition). They are massively live-or-die incentivised.

ChrisArchitect 2 years ago

Jan Leike's OpenAI departure statement

https://news.ycombinator.com/item?id=40391412

goeiedaggoeie 2 years ago

Basically they were employed as futurologists.

Perhaps the mental model I should use is academic security researchers. Did they publish?

reducesuffering 2 years ago

https://scholar.google.co.uk/citations?user=beiWcokAAAAJ&hl=...

Settings

OpenAI's Long-Term AI Risk Team Has Disbanded

Keyboard Shortcuts