Anthropic Releases Claude Fable 5 as Its Most Capable Public AI Model with Safety Guardrails

Anthropic releases Claude Fable 5, its most capable public AI, with safety guardrails that refuse high-risk queries and reroute them to a less powerful model.

Jun 9, 2026
5 min read
Technobezz

Two months ago, Anthropic unveiled Claude Mythos, an AI model it considered too capable to share with the public, particularly because of its skill at cybersecurity tasks such as finding software vulnerabilities across major operating systems and browsers. On Tuesday, the company released essentially the same technology to anyone willing to pay for it.

Claude Fable 5, built on the same Mythos-class architecture, refuses to answer questions in high-risk areas like cybersecurity, biology, and chemistry. Instead, those queries get handed off to Claude Opus 4.8, a less capable model Anthropic launched last month. If the company suspects someone is trying to distill Fable 5 by training a smaller model on its responses, those requests get rerouted too.

"Fable 5's capabilities exceed those of any model we've ever made generally available," Anthropic said in announcing the model. "The longer and more complex the task, the larger Fable 5's lead over our other models." On some benchmarks, the company says Fable 5 scored more than 10% higher than Opus 4.8.

Outside testers echoed that. Analytics company Hex said Fable was the first model to hit 90% on its core analytics benchmark, and vibe-coding platform Base44 noted the model excels at "one-shotting full apps."

The safety system, while functional, is conservative by design. Anthropic's head of product management for research, Dianne Penn, told CNBC the company wanted to provide the technology "in a valuable fashion, and at the same time providing the right safety guardrails so that it can do asymmetrically more benefits than harm." In practice, that means the classifiers sometimes flag harmless requests, but Anthropic says more than 95% of Fable sessions run entirely on the model's own responses, with no fallback to Opus 4.8.

Anthropic stress-tested its safeguards before the release. The company says an external bug bounty produced no universal jailbreaks in over 1,000 hours of testing, and external red-teaming organizations also came up empty.

Still, the company is taking the unusual step of requiring 30-day retention of all traffic to its Mythos-class models, even for enterprises that previously had zero-retention agreements, to help defend against "complex and novel attacks (including new jailbreaks)." Anthropic says it will not use the data for training.

For customers who want the unrestricted version, Anthropic is also releasing Claude Mythos 5, the same underlying model with the safeguards lifted. Access is limited to the cyberdefenders and infrastructure providers already approved through Project Glasswing, Anthropic's cybersecurity consortium whose launch partners include Amazon Web Services, Apple, Google, Cisco, and Microsoft. The company says it plans to expand access over time through a "trusted access program."

Pricing for both Fable 5 and Mythos 5 is $10 per million input tokens and $50 per million output tokens, double the cost of Opus 4.8. Subscription access rolls out in stages: from launch through June 22, Fable 5 is included in Pro, Max, Team, and seat-based Enterprise plans at no extra cost. On June 23, Anthropic will remove it from those plans, and using it after that will require usage credits.
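
To make the per-token pricing concrete, here is a rough sketch of the arithmetic in Python. The Fable 5 rates are the ones quoted above; the Opus 4.8 rates are not stated directly and are inferred here from the "double the cost" comparison, so treat them as an assumption. The model labels and the function name are hypothetical, chosen only for illustration, and are not API identifiers.

```python
# Per-million-token prices in USD.
# Fable 5 rates come from the article. Opus 4.8 rates are an assumption,
# inferred as half of Fable 5's from the "double the cost" comparison.
PRICES = {
    "fable-5": {"input": 10.00, "output": 50.00},
    "opus-4.8": {"input": 5.00, "output": 25.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 200,000 input tokens and 50,000 output tokens.
    for model in PRICES:
        print(model, estimate_cost(model, 200_000, 50_000))
```

Under these assumptions, a request with 200,000 input tokens and 50,000 output tokens would cost about $4.50 on Fable 5, versus about $2.25 on Opus 4.8.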

The launch comes about a week after Anthropic filed confidential paperwork for its initial public offering. Anthropic says it expects demand for Fable 5 to be "very high, and difficult to predict."
