Anthropic released Claude Fable 5 to the public on Tuesday, its first Mythos-class model deemed safe enough for general use. Hours later, the AI community was in open revolt.
Buried in the model's 319-page system card, Fortune's Sharon Goldman reported, lies a detail Anthropic did not highlight: Fable 5 silently downgrades its responses when it detects requests related to cutting-edge AI development work, including building the infrastructure used to train large models. The model still responds, but uses "interventions to limit Claude's effectiveness" without telling the user.
This is different from Fable 5's other restrictions. When the model blocks cybersecurity or biology queries, it visibly redirects users to the less capable Claude Opus 4.8 with a notification.
The AI research restriction is invisible. The system card states this explicitly: "not visible to the user."
Anthropic estimated the restriction affects roughly 0.03% of traffic and defended the approach by saying "enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms." The AI research community saw it differently. Nathan Lambert, an open model researcher who most recently led work at AI2, called the practice "appalling."
"To have my access to the cutting edge models for my work rug pulled in an under the table fashion is appalling," Lambert wrote. "To me this paints Anthropic clearly as anti-science, and therefore anti-progress and anti-safety."
Dean Ball, a senior fellow at the Foundation for American Innovation and former White House Office of Science and Technology Policy advisor, coined a term for it: "secret sabotage." He wrote that the policy "massively and profoundly raises the status of the argument that AI safety has been hype to justify monopolistic behavior by labs."
Jeremy Howard, head of nonprofit research group Fast AI, pointed to the asymmetry. Anthropic keeps full Fable 5 capabilities for its own researchers while throttling external researchers.
"They've said they'll sabotage others who try," Howard wrote. "This means the AI frontier advances, and power imbalance increases."
Even former Anthropic employees joined the criticism. Behnam Neyshabur, who previously co-led Anthropic's effort to develop an AI scientist, posted on X: "Working on AI for cancer?
Sorry, I can't help you. Working on AI for Alzheimer's Disease?
Sorry, I'm becoming a bit dumb when it comes to the AI part of it."
Not everyone piled on. Ethan Mollick, a Wharton associate professor, wrote that Fable 5 "outperformed basically every other public model I have used by a considerable margin." Andrej Karpathy, who joined Anthropic last month, called it a "super exciting release" and a "major-version-bump-deserving step change forward," though he noted safeguards are "configured to be a little too trigger-happy for launch."
The controversy lands at an awkward moment for Anthropic. The company confidentially filed for IPO paperwork just over a week ago.
Dianne Na Penn, Anthropic's head of product management, research, and labs, told Fortune the company is "raising the bar on the intelligence of the models" while "pushing the frontier in a safe manner." She acknowledged some benign requests would initially be blocked and said Anthropic is working on post-launch safeguard improvements.
Fable 5 and the restricted Mythos 5 are priced at $10 per million input tokens and $50 per million output tokens, available immediately via the Claude API. Early data shows 95% of sessions run on Fable 5's full capabilities without triggering fallbacks.
Independent red-teaming spanning over 1,000 hours found no universal jailbreaks. But the invisible downgrade for AI research queries has opened a new front in the debate over safety versus control. The question is no longer just whether Anthropic can contain Mythos-class risks.
It's whether the company's definition of "safe" also means keeping competitors and researchers at a distance.













