Microsoft is now making competing AI models work together, pairing OpenAI's GPT with Anthropic's Claude in a new quality-control system that aims to reduce hallucinations and improve accuracy. The company unveiled its "Critique" feature on Monday, which forces GPT-generated responses through Claude's verification filter before presenting them to users.
In this workflow, OpenAI's model creates initial drafts while Anthropic's system checks for factual accuracy and quality issues.
Nicole Herskowitz, corporate vice president of Microsoft 365 and Copilot, told Reuters the multi-model approach will speed up user workflows while keeping AI hallucinations in check. The company eventually plans to make the verification process bidirectional, allowing GPT to review Claude's drafts as well.
A second feature, called "Model Council," lets users compare responses from different AI models side by side. Both upgrades arrive as Microsoft expands early access to its Copilot Cowork tool, which automates multi-step tasks across Microsoft 365 applications.
Copilot Cowork is Microsoft's answer to Anthropic's viral Claude Cowork product; Microsoft has been testing its version since earlier this month. The tool targets enterprise customers through Microsoft's Frontier program, which gives select users early access to emerging AI capabilities.
"Having various different models from different vendors in Copilot is highly attractive - but we're taking this to the next level, where customers actually get the benefits of the models working together," Herskowitz said.
The move comes amid intensifying competition in the enterprise AI space, where Google's Gemini and autonomous agents like Claude Cowork have challenged Microsoft's Copilot dominance. By integrating Anthropic's technology directly into its ecosystem, Microsoft aims to neutralize competitive threats while improving the reliability of its AI outputs.
The Windows maker has been racing to drive adoption of its Copilot assistant as enterprise customers increasingly demand more accurate and verifiable AI outputs.