Why AI Safety Officials Keep Quitting Their Jobs

Something's broken in the AI industry's safety apparatus, and it's not the technology

Sep 20, 2025
10 min read


Something's broken in the AI industry's safety apparatus, and it's not the technology - it's the humans running the show.

The exodus from OpenAI reads like a who's who of AI safety leadership. Ilya Sutskever, Jan Leike, Steven Adler, Miles Brundage, Daniel Kokotajlo, Leopold Aschenbrenner, Pavel Izmailov, Cullen O'Keefe, William Saunders. These aren't random departures from random teams; these are the people specifically tasked with making sure artificial intelligence doesn't accidentally destroy civilization. And they're all walking away.

The pattern extends beyond OpenAI, too. Geoffrey Hinton, widely considered the "godfather of AI," quit Google in May 2023 with a stark warning: "Right now, they're not more intelligent than us, as far as I can tell. But I think they soon may be." When the person who helped build the foundation of modern AI is terrified by its pace, that's not exactly a confidence booster.

The Trust Collapse

According to multiple former employees and observers familiar with OpenAI's internal dynamics, safety-minded employees have systematically lost faith in CEO Sam Altman's leadership. Trust didn't collapse overnight; it eroded incrementally, with each incident compounding the last. The breaking point came after Altman's dramatic November 2023 firing and subsequent power grab, when he threatened to take OpenAI's talent to Microsoft unless the board reinstated him.

That move revealed something crucial about Altman's character: when faced with oversight, his response wasn't to address concerns but to eliminate the overseers entirely. He returned with a friendlier board and fewer checks on his authority. For safety researchers already worried about OpenAI's direction, this was a clear signal that corporate priorities would always trump safety considerations.

Jan Leike, former co-leader of OpenAI's superalignment team, didn't mince words when he resigned: "I have been disagreeing with OpenAI leadership about the company's core priorities for quite some time, until we finally reached a breaking point." His departure thread on X painted a picture of a safety team "sailing against the wind," struggling for computing resources while the company raced toward commercialization.

The institutional pressure is real. OpenAI's infamous non-disparagement agreements essentially buy silence from departing employees - refuse to sign, and you forfeit potentially millions in equity. Only a few, like Daniel Kokotajlo, have been willing to sacrifice their financial stakes to speak freely. "I gradually lost trust in OpenAI leadership and their ability to responsibly handle AGI, so I quit," Kokotajlo explained.

The Impossible Job

Here's the thing safety officials are grappling with: they're being asked to solve problems that barely have technical solutions yet. As White House tech adviser Arati Prabhakar bluntly admitted, the technology for assessing AI safety "barely exists." Current AI models can already handle simple reasoning and draw on more general knowledge than any single human, but determining whether they'll generate cyberattacks or help build bioweapons "is presently not fully in our grasp."

Steven Adler, another OpenAI safety researcher who recently went public about his departure, captured the existential weight of the role: "Even if a lab truly wants to develop AGI responsibly, others can still cut corners to catch up, maybe disastrously. And this pushes all to speed up." No lab has a "solution to AI alignment today," yet the race continues at breakneck pace.

This creates an impossible dynamic for safety officials. They're working on humanity's most important technical challenge while their employers prioritize shipping products and staying competitive. Geoffrey Hinton highlighted the zero-sum nature of the problem: "Even if everybody in the US stopped developing it, China would just get a big lead."

The safety researchers aren't just worried about theoretical future risks, either. Chinese scientists at Fudan University published preliminary research in December 2024 suggesting that AI models could self-replicate and show survival instincts when facing shutdown - behaviors that weren't explicitly programmed. While this study hasn't been peer-reviewed and remains controversial, the possibility that AI systems might develop their own sub-goals around self-preservation has safety researchers losing sleep for good reason. If confirmed, such behaviors would represent a significant milestone in AI development that we're not prepared to handle.

The Human Factor

What's particularly striking about this wave of departures is how consistently they point to human failures rather than technical ones. These aren't researchers fleeing because they saw some horrifying technological breakthrough (despite viral "What did Ilya see?" memes). They're leaving because they've lost faith in the humans making decisions about that technology.

Multiple sources describe a pattern where companies say they value safety but consistently prioritize speed and profit margins. Altman's reported fundraising with autocratic regimes like Saudi Arabia for AI chip manufacturing exemplifies this disconnect - if you truly care about safe AI deployment, why accelerate development by working with governments that might use AI for surveillance and human rights abuses?

For safety officials, this represents a fundamental betrayal. They joined companies believing in missions to build beneficial AI, only to watch those companies optimize for market dominance instead. The technical challenges are hard enough without having to fight internal political battles over resource allocation and strategic priorities.

China's growing focus on AI safety - including calls for "oversight systems to ensure the safety of artificial intelligence" in a July 2024 Chinese Communist Party policy document - suggests that even geopolitical competitors recognize the stakes involved. When even authoritarian governments publicly acknowledge AI safety concerns, it underscores how widely these risks are now taken seriously.

Another layer to this story is where top minds are heading after leaving U.S. labs. Song-Chun Zhu, a pioneering AI scientist once at UCLA and Harvard, stunned colleagues in 2020 by moving to China, where he now runs the Beijing Institute for General Artificial Intelligence with state backing. Zhu openly rejects the Silicon Valley belief that scaling large neural networks will lead to general intelligence, arguing instead that "small data, big task" reasoning better captures what real intelligence looks like. His departure shows that talent flight isn't just about disillusionment with leadership - it's also reshaping the global balance of AI research.

What Comes Next

This talent drain is happening while governments are hardening their positions. The Trump administration has touted more than $90 billion in AI and energy investments for Pennsylvania to secure American dominance, while Beijing is fusing AI into everything from elder care to defense. In this race, the researchers themselves - where they choose to work and which philosophical approach they back - are becoming as consequential as the models they build.


With OpenAI's superalignment team gutted and many of the field's leading safety researchers scattered to the winds, the immediate future looks precarious. The company has redistributed safety responsibilities across various teams, but the dedicated focus on existential risks from future AI systems - the "whole point" of the superalignment team, according to insiders - has essentially evaporated.

This leaves the AI industry in a dangerous position: racing toward artificial general intelligence without adequate safety guardrails, led by companies that have proven they'll sacrifice oversight for competitive advantage. As Jan Leike warned, "I believe much more of our bandwidth should be spent getting ready for the next generations of models." Instead, the bandwidth is being spent on product launches and market positioning.

The safety officials who've quit aren't giving up on the mission - they're giving up on trying to accomplish it within corporate structures that systematically undermine their work. Some, like Ilya Sutskever, are pursuing "projects that are very personally meaningful" outside the constraints of commercial AI labs. Others are working on technical safety research at academic institutions or independent organizations.

Zhu’s case is telling. His lab in Beijing recently unveiled TongTong, a virtual child-like AI agent designed to show commonsense reasoning that large language models still lack. Whether or not his approach succeeds, the fact that one of America’s most celebrated AI professors felt he had “no choice” but to leave the U.S. for China signals how fragile the Western hold on AI leadership has become.

But here's the uncomfortable truth: the companies building the most powerful AI systems have now driven away many of their most safety-conscious employees. The people best positioned to solve AI alignment problems are no longer working where those problems are most urgent. That's not a technical failure - it's a human one, and it might be the most dangerous kind.

As Steven Adler, who left OpenAI in November 2024, put it with characteristic understatement: 

"I'm pretty terrified by the pace of AI development these days." 

When the experts responsible for keeping AI safe are terrified of the very systems they helped build, maybe the rest of us should be paying closer attention.
