Microsoft’s MDASH AI System Found 16 Windows Vulnerabilities Fixed in Patch Tuesday

Microsoft's MDASH AI system, using adversarial agents, discovered 16 Windows flaws in Patch Tuesday, including critical RCE bugs in networking and authentication.

May 14, 2026
Technobezz

An AI system that argues with itself found 16 Windows vulnerabilities patched in this month's Patch Tuesday, including four critical remote code execution bugs Microsoft says are likely to be exploited.

Microsoft's MDASH, short for multi-model agentic scanning harness, orchestrates more than 100 specialized AI agents across frontier and distilled models, each with a distinct role.

Auditor agents flag suspicious code paths. Debater agents challenge those findings. A finding survives only if it passes cross-examination.
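The auditor-and-debater loop described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the `Finding` type, the `cross_examine` function, and the two toy agents are hypothetical names invented here, and simple string checks stand in for what would be model calls. Microsoft has not published MDASH's actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Finding:
    line_no: int   # where the auditor thinks the problem is
    text: str      # the flagged source line
    claim: str     # what the auditor believes is wrong


# Hypothetical role signatures: an auditor proposes findings,
# a debater returns True when it can refute a finding.
Auditor = Callable[[str], List[Finding]]
Debater = Callable[[Finding], bool]


def cross_examine(source: str, auditors: List[Auditor],
                  debaters: List[Debater]) -> List[Finding]:
    """Keep only the findings that no debater manages to refute."""
    candidates: List[Finding] = []
    for audit in auditors:
        for finding in audit(source):
            if finding not in candidates:   # drop duplicate reports
                candidates.append(finding)
    return [f for f in candidates if not any(refute(f) for refute in debaters)]


def free_call_auditor(source: str) -> List[Finding]:
    """Toy auditor: flag every line that releases memory."""
    return [
        Finding(i, line.strip(), "pointer may be freed again later")
        for i, line in enumerate(source.splitlines(), start=1)
        if "free(" in line
    ]


def pointer_cleared_debater(finding: Finding) -> bool:
    """Toy debater: refute the claim if the pointer is nulled on the same line."""
    return "= NULL" in finding.text


SAMPLE = "free(a);\nfree(b); b = NULL;\n"
survivors = cross_examine(SAMPLE, [free_call_auditor], [pointer_cleared_debater])
print([f.line_no for f in survivors])
```

In this toy run the auditor flags both lines, the debater knocks out the second, and only the first finding survives, which is the filtering behavior the article attributes to cross-examination.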

"Disagreement between models is itself a signal," Microsoft explained in a blog post. "An auditor does not reason like a debater, which does not reason like a prover." The system uncovered flaws in Windows' core networking and authentication stack, TCP/IP kernel networking, IKEv2 (the VPN keying service), Netlogon, and the DNS API library. Ten of the 16 vulnerabilities reside in kernel mode; six are user mode. The majority are reachable from a network position with no credentials required.

Two critical RCEs stand out. CVE-2026-33824 is a double-free vulnerability in ikeext.dll with a CVSS score of 9.8. A shallow memcpy during fragment reassembly leaves two code paths holding the same heap pointer, and both eventually free it. The bug spans six source files.

Because IKEEXT runs as LocalSystem within svchost.exe, successful exploitation gives an attacker full system compromise before any authentication takes place.
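To see why a shallow copy sets up a double free, consider the following toy model in Python. It is a sketch of the general bug class only, not the ikeext.dll code: `ToyHeap`, `Fragment`, and the copy helpers are hypothetical names, and integer handles stand in for heap pointers. In real C code a second free is undefined behavior that can corrupt the heap, whereas this model raises an error so the mistake is visible.

```python
from dataclasses import dataclass


class ToyHeap:
    """Minimal allocator model that tracks which handles are still live."""

    def __init__(self) -> None:
        self._live = set()
        self._next = 1

    def alloc(self) -> int:
        handle = self._next
        self._next += 1
        self._live.add(handle)
        return handle

    def free(self, handle: int) -> None:
        if handle not in self._live:
            raise RuntimeError(f"double free of handle {handle}")
        self._live.remove(handle)


@dataclass
class Fragment:
    payload: int  # heap handle standing in for a pointer to the payload buffer


def shallow_copy(frag: Fragment) -> Fragment:
    """Copies the struct bytes only, so both structs share one payload handle."""
    return Fragment(payload=frag.payload)


def deep_copy(heap: ToyHeap, frag: Fragment) -> Fragment:
    """Allocates a separate payload, so each struct owns its own handle."""
    return Fragment(payload=heap.alloc())


def release_both(heap: ToyHeap, a: Fragment, b: Fragment) -> bool:
    """Free both payloads; return False if the second free is a double free."""
    heap.free(a.payload)
    try:
        heap.free(b.payload)
    except RuntimeError:
        return False
    return True


heap = ToyHeap()
original = Fragment(heap.alloc())
print(release_both(heap, original, shallow_copy(original)))

heap = ToyHeap()
original = Fragment(heap.alloc())
print(release_both(heap, original, deep_copy(heap, original)))
```

The first call reports a double free because both structs carry the same handle. The second is clean because the deep copy owns a separate allocation.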

CVE-2026-33827 (CVSS 8.1) is a use-after-free in the Windows kernel TCP/IP stack triggered by specially crafted IPv4 packets with the SSRR routing option. The vulnerable pointer release and its reuse are separated by multiple validation checks and alternate control flow branches, so no single-function view connects them.

MDASH caught it by cross-referencing analogous patterns across the codebase.

The system was built by Microsoft's Autonomous Code Security Team, which draws from Team Atlanta at Georgia Tech, winners of a $20 million prize in DARPA's AI Cyber Challenge. Taesoo Kim, Microsoft's VP of agentic security, leads the team.

On a private Windows driver called StorageDrive seeded with 21 deliberately injected vulnerabilities, MDASH identified all 21 with zero false positives. Because StorageDrive is a private codebase never publicly released, the test minimized the chance AI models had seen the code during training.

"This simple test shows that the reasoning and vulnerability discovery capabilities of codename MDASH can approximate professional offensive researchers," Kim said. On the public CyberGym benchmark, 1,507 real-world vulnerabilities from 188 open-source projects, MDASH scored 88.45%, topping the leaderboard by roughly five points over the next competitor. It also achieved 96% recall against five years of confirmed Microsoft Security Response Center vulnerabilities in clfs.sys and 100% recall in tcpip.sys. The system is model-agnostic by design. Swapping in a better model is a configuration change, not a rebuild.

MDASH is currently in limited private preview with customers and Microsoft's own security engineering teams, with plans to open to enterprise customers next month.

May's Patch Tuesday addresses roughly 140 CVEs total, 17 rated Critical, with no zero-days for the first time in months. But the absence of actively exploited flaws doesn't mean security teams can relax. The Secure Boot certificate expiration deadline on June 26 gives organizations roughly 45 days to deploy updated certificates before Windows devices enter a degraded security state.

"AI vulnerability discovery has crossed from research curiosity into production-grade defense at enterprise scale," Kim said. "The durable advantage lies in the agentic system around the model rather than any single model itself."
