The News
For the people who worry that artificial intelligence could destroy the world, known as “rationalists,” the last few days have been confusing and alarming. The board of the highest-profile AI company, OpenAI, fired its CEO, Sam Altman, for reasons thought to be related to its worries about the risks of AI. That promptly backfired in incredibly confusing fashion: one of the board members who fired him signed a letter condemning his own actions and threatening to quit. Now, Altman has apparently been rehired.
Know More
Since the earliest days of AI, researchers have raised the possibility that a really powerful intelligence could be dangerous. I. J. Good, a colleague of Alan Turing’s and an early AI pioneer, warned in 1965 of an “intelligence explosion” and said that “the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.”
Since the late 1990s those worries have become more specific, coalescing around Nick Bostrom’s 2014 book Superintelligence: Paths, Dangers, Strategies and Eliezer Yudkowsky’s blog LessWrong. The argument is not that AI will become conscious, or that it will decide it hates humanity. Instead, it is that AI will become extraordinarily competent, but that when you give it a task, it will fulfill exactly that task as specified, not the thing you actually wanted. Just as teachers start teaching to the test when schools are judged on how many children achieve a certain grade, an AI will optimize the metric we tell it to optimize. If we are dealing with something vastly more powerful than human minds, the argument goes, that could have very bad consequences.
In 2015, some of AI’s leading minds gathered in Puerto Rico, along with figures including one Elon Musk. Musk became so alarmed by the prospect of AI that he helped set up OpenAI, with the explicit intention of reducing the risk that AI would destroy humanity. “With artificial intelligence, we are summoning the demon,” he said.
But the people who worried about AI saw Musk’s actual plan for OpenAI — making AI open-source — as unhelpful at best. Yudkowsky described Musk’s position as “AGI is summoning a demon, so let’s make sure everyone has one.”
Over time, OpenAI changed its view: in 2018, Musk was sidelined by the other co-founders, including Altman, and the open-source policy was dropped. Altman has always said that he takes AI risk seriously. He has argued that AI poses an existential risk to humanity, and at one point his Twitter bio read “eliezer yudkowsky fan fiction account.” But rationalists I’ve spoken to worried he was mainly paying it lip service: He did not, one told me, ever actually speak to Yudkowsky. (It is worth noting that many outside the “rationalsphere,” and some within it, would say that’s an entirely sensible decision.)
Rumors are swirling about exactly why the board, stacked with serious believers in AI risk, fired Altman. But what’s clear is that the whole debacle has made it easy to mock the rationalists and effective altruists who worry about this sort of thing. Zvi Mowshowitz, a prominent rationalist, said on his blog: “[Skeptics are saying] that this will completely discredit EA or ‘doomerism’ or any concerns over the safety of AI, forever. Yes, they say this every week, but this time it was several orders of magnitude louder and more credible.”
Tom’s view
It’s very easy to mock concerns that AI will kill everyone, but I’ve spoken to enough researchers who think it’s plausible that I can’t dismiss the idea. The founders of OpenAI, as well as of other major AI organizations such as DeepMind, Inflection, and Anthropic, are all on the record saying it’s plausible, as are Geoffrey Hinton and Yoshua Bengio, two of the most senior and well-respected AI scientists.
That doesn’t mean it’s going to happen. (One person at DeepMind several years ago said something to me along the lines of: “Yes, AI risk is real, but we’re going to solve it and make AI awesome.”) But it’s not crazy sci-fi or implausible nonsense.
For now, the effort to manage AI seems to have encountered another ruthlessly optimizing system. Capitalism has some parallels to the purported risks of AI, in that companies optimize for a certain metric, shareholder value, whether or not that metric is perfectly aligned with the things we actually care about: human flourishing or moral goodness or whatever. The OpenAI governance structure, with its nonprofit board empowered to remove the CEO if he or she was pursuing shareholder value rather than the benefit of humanity, was designed to circumvent that problem. Ultimately, pressure from investors overcame those guardrails.
Room for Disagreement
Plenty of people do think that AI risk is crazy sci-fi or implausible nonsense. Yann LeCun, who is currently at Meta and who, alongside Hinton and Bengio, is described as one of the “Godfathers of AI,” said: “Will AI take over the world? No, this is a projection of human nature on machines.”
Notable
- The original LessWrong posts on AI risk have been made into an e-book, Rationality: From AI to Zombies. The whole edited version is also available here. It is roughly twice as long as The Lord of the Rings, but worth the time.
- The researcher Katja Grace surveyed AI scientists to ask when they thought superintelligent AI was likely to arrive, and how likely it would be to kill everyone. The answers can be summed up as “quite soon” and “quite likely”: more specifically, the median answer was that it would probably arrive by 2059, and that there was a 5% chance of it causing human extinction.
- If you want to understand the idea of optimizing for something and how that can go wrong in many more ways than simply AI or capitalism, the long, strange, but mind-expanding 2014 essay “Meditations on Moloch” by Scott Alexander is a fascinating place to start.
- Nate Soares, the executive director of the Machine Intelligence Research Institute, gave a talk at Google in 2017 on why AI is dangerous and why we should think of it more like Disney’s Fantasia than The Terminator. He issued a warning to programmers: “If nothing yet has struck fear into your heart, I suggest meditating on the fact that the future of our civilization may well depend on our ability to write code that works correctly on the first deploy.” The video and transcript are here.