Microsoft's MDASH agentic AI security system tops vulnerability discovery benchmarks
Microsoft has described a multi-model agentic AI security system, codenamed MDASH, designed to support vulnerability discovery and cybersecurity research across complex codebases.
According to Microsoft, the system helped researchers identify 16 vulnerabilities across Windows networking and authentication components, including issues in the Windows TCP/IP stack, IKEv2 services, DNS handling and Netlogon processes. Several of the vulnerabilities were reachable over networks without authentication, the company said.
MDASH was developed by Microsoft’s Autonomous Code Security team and combines more than 100 specialised AI agents with an ensemble of frontier and distilled AI models. The system is structured as a multi-stage pipeline covering code preparation, scanning, validation, deduplication and proof generation.
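The staged pipeline Microsoft describes can be sketched as a chain of small functions. This is a minimal illustration only: the stage names follow the article, but the `Finding` type, the hard-coded sample findings, and all function bodies are assumptions for demonstration, not MDASH's actual design or API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    """Illustrative record of a reported vulnerability (not MDASH's schema)."""
    file: str
    line: int
    kind: str          # e.g. "RCE", "DoS", "info-disclosure"
    validated: bool = False

def prepare(codebase):
    # Code preparation: keep only sources the scanners understand.
    return [f for f in codebase if f.endswith((".c", ".h"))]

def scan(files):
    # Scanning: stand-in for the ensemble of agents; returns canned findings,
    # including a duplicate report of the same flaw.
    return [Finding("tcpip_stack.c", 120, "DoS"),
            Finding("tcpip_stack.c", 120, "DoS"),
            Finding("netlogon.c", 88, "RCE")]

def validate(findings):
    # Validation: mark findings an agent could confirm.
    return [Finding(f.file, f.line, f.kind, validated=True) for f in findings]

def deduplicate(findings):
    # Deduplication: collapse reports pointing at the same flaw.
    return list({(f.file, f.line, f.kind): f for f in findings}.values())

def prove(findings):
    # Proof generation: stand-in for proof-of-concept creation.
    return {f: f"PoC for {f.kind} at {f.file}:{f.line}" for f in findings}

if __name__ == "__main__":
    files = prepare(["tcpip_stack.c", "notes.md", "netlogon.c"])
    results = prove(deduplicate(validate(scan(files))))
    print(len(results))   # two unique, validated findings survive the pipeline
```

The value of separating the stages, as the article's description implies, is that noisy scanner output is filtered (validation) and collapsed (deduplication) before the expensive proof-generation step runs.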
Microsoft's announcement says the system identified remote code execution flaws, denial-of-service issues, information disclosure vulnerabilities and security feature bypasses. Microsoft also described the use of specialised auditor, debater and prover agents designed to analyse vulnerabilities across multiple files and code paths.
Microsoft said MDASH uses plugins and domain-specific knowledge to support validation and proof-of-concept generation, allowing security experts to add context that foundation models may not capture on their own.
The company also reported results from internal and public benchmarks. On a private test driver, it said, MDASH identified all 21 deliberately inserted vulnerabilities with zero false positives in that run. Against five years of confirmed Microsoft Security Response Center cases, the system achieved 96% recall in clfs.sys and 100% in tcpip.sys, and it scored 88.45% on the public CyberGym benchmark.
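Recall here measures the fraction of known, confirmed vulnerabilities the system rediscovers. A quick illustration, using made-up case counts rather than Microsoft's actual data:

```python
def recall(found: set, known: set) -> float:
    """Fraction of known vulnerabilities that were rediscovered."""
    return len(found & known) / len(known)

# Hypothetical numbers only: 25 confirmed historical cases, 24 rediscovered.
known = {f"CASE-{i}" for i in range(25)}
found = {f"CASE-{i}" for i in range(24)}
print(f"{recall(found, known):.0%}")   # -> 96%
```

High recall against historical cases matters because it estimates how many previously shipped flaws a tool would have caught; the zero-false-positive claim addresses the complementary precision side.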
Microsoft said the system is already being used by its security engineering teams and is being tested with a small group of customers through a limited private preview.
Why does it matter?
MDASH shows how agentic AI is moving into high-value cybersecurity tasks such as vulnerability discovery, validation and proof generation. If systems like this can reliably reduce false positives and help researchers find exploitable flaws earlier, they could improve defensive security at scale. The same development also raises governance questions around access, oversight and dual-use risk, since tools capable of finding and proving vulnerabilities may be valuable to both defenders and attackers.
Microsoft also discussed broader implications for AI-assisted cybersecurity operations, including the use of agentic systems across vulnerability discovery, validation and remediation workflows.