Authors: Cloud Security Alliance AI Safety Initiative
Published: 2026-06-07

Categories: Vulnerability Management, AI Security, Open Source Security, Patch Management

AI Finds 21 FFmpeg Zero-Days for $1,000

Key Takeaways

Autonomous security startup depthfirst disclosed on June 6, 2026 that its AI agent discovered 21 previously unknown vulnerabilities in FFmpeg—the open-source multimedia library embedded in billions of devices—after scanning approximately 1.5 million lines of C code for a total compute cost of roughly $1,000 [1][2].
Nine of the flaws have received CVE identifiers (CVE-2026-39210 through CVE-2026-39218); the remainder have been fixed upstream but not yet formally numbered; the bugs are predominantly heap and stack overflows spanning the TS demuxer, VP9 decoder, DASH demuxer, and AV1 RTP depacketizer [1][3].
Several of the vulnerabilities had remained latent in widely deployed production code for between 15 and 23 years; one stack overflow in FFmpeg’s service-description-table parser was introduced in 2003 and persisted through more than two decades of security review by human researchers [1][2].
One of the disclosed flaws, a heap buffer overflow in FFmpeg’s AV1 RTP depacketizer, is exploitable over the network with no special preconditions: a single 183-byte packet triggers it when a user runs the common command ffmpeg -i rtsp://attacker/stream, giving an attacker a remote code execution primitive [1].
On the same day, Google shipped Chrome 149 with patches for a record 429 security bugs, the highest single-release patch count in Chrome’s history and more than any prior browser release that security trackers have catalogued, underscoring that AI-accelerated discovery is now producing findings at a volume that strains remediation infrastructure across the industry [2][3].
Enterprise mean time to remediate critical vulnerabilities has dropped to roughly 38 days under improved programs [7], but that figure remains structurally mismatched against an exploitation window that Mandiant’s M-Trends 2026 report now characterizes as negative for the most actively weaponized vulnerability classes: on average, exploitation begins before a patch is available. AI-accelerated discovery will continue to widen this gap unless organizations fundamentally reconsider their remediation operating model [9][6].

Background

FFmpeg is among the most ubiquitous software components on the planet. First released in 2000, the project provides the foundational encoding, decoding, muxing, demuxing, and filtering capabilities that underlie consumer streaming platforms including YouTube and Netflix, communications tools including Zoom and Discord, media players including VLC, and an enormous range of embedded systems from television firmware to surveillance infrastructure [10]. Enterprise security teams frequently discover FFmpeg three or four layers deep in their dependency chains—incorporated by video processing frameworks that are themselves dependencies of application-layer software—and organizations that do not directly deploy FFmpeg are often unknowingly exposed through cloud services and managed platforms that rely on it internally. Thousands of organizations declare FFmpeg as a direct software dependency, and the true exposure surface including indirect consumption through media frameworks, cloud services, and embedded systems is substantially larger.

The vulnerability disclosure came from depthfirst, an early-stage security research startup whose primary product is an autonomous AI agent designed to perform the kind of deep, exploratory code analysis that traditionally requires senior human security researchers. Unlike fuzz-only approaches that execute code paths at high volume hoping to surface crash inputs, depthfirst’s agent reasons about code structure and data flow to identify the root cause of exploitable conditions and then generates reproducible proof-of-concept inputs confirming each finding. The company reported the FFmpeg engagement as an efficiency benchmark: the $1,000 figure represents the total cloud compute cost for the full scan, yielding 21 confirmed zero-days with PoCs [1][3]. For context, Anthropic’s Claude Mythos model, a general-purpose frontier AI system used in Project Glasswing, required approximately $10,000 in compute for comparable FFmpeg analysis, suggesting that purpose-built, optimized autonomous agents can deliver competitive vulnerability discovery at roughly one-tenth the cost of general-purpose frontier model deployments [3].

The FFmpeg disclosure arrives in the context of Project Glasswing, Anthropic’s coordinated program to use Claude Mythos Preview for defensive vulnerability research across systemically important open-source software. Project Glasswing scanned more than 1,000 open-source projects and flagged 23,019 potential vulnerabilities; of the findings that independent security firms reviewed, 90.6 percent were confirmed as real bugs, and 6,202 findings were rated high or critical severity [5]. Partners in the program include Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks [4]. The Glasswing disclosures, combined with the depthfirst FFmpeg report and the concurrent Chrome 149 release, illustrate a structural mismatch now operating at scale: multiple independent actors (commercial research startups, hyperscaler AI programs, and automated tooling) are producing verified zero-day findings faster than open-source maintainers or enterprise patch pipelines were designed to absorb them.

Security Analysis

The Economics of AI-Driven Discovery

The $1,000 price point for 21 confirmed zero-days deserves careful economic analysis, because the number is not simply a curiosity: it marks a structural discontinuity in the vulnerability market. Prior to AI-driven autonomous discovery, producing a confirmed, exploitable zero-day in a mature, widely audited project like FFmpeg required hundreds to more than a thousand hours of expert researcher time. Assuming senior security researcher rates of $250 to $350 per hour and an engagement of 600 to 1,400 hours for a codebase of this scope (consistent with published penetration testing industry benchmarks for large-scale C audits), a comparable human-led engagement would cost between roughly $150,000 and $490,000. Depthfirst’s engagement produced a similar output at roughly 0.2 to 0.7 percent of that cost [1][2].
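The comparison above is simple enough to check directly. The following minimal sketch multiplies the stated hourly rates and engagement hours at face value and expresses the $1,000 compute cost as a share of the result; the variable and function names are illustrative, and the inputs are only the figures quoted in the paragraph above.

```python
# Inputs taken from the paragraph above; names are illustrative only.
RATE_USD_PER_HOUR = (250, 350)   # senior researcher rate, low and high
ENGAGEMENT_HOURS = (600, 1400)   # assumed engagement length, low and high
AI_SCAN_COST_USD = 1_000         # reported compute cost of the scan
FINDINGS = 21                    # confirmed zero-days


def human_cost_range(rates, hours):
    """Return the (low, high) cost in USD of a human-led engagement."""
    return rates[0] * hours[0], rates[1] * hours[1]


def cost_share_percent(ai_cost, human_cost):
    """Return the AI compute cost as a percentage of a human-led cost."""
    return 100.0 * ai_cost / human_cost


low, high = human_cost_range(RATE_USD_PER_HOUR, ENGAGEMENT_HOURS)
print(f"Human-led estimate: ${low:,} to ${high:,}")
print(
    f"AI cost share: {cost_share_percent(AI_SCAN_COST_USD, high):.2f}% "
    f"to {cost_share_percent(AI_SCAN_COST_USD, low):.2f}%"
)
print(f"Compute cost per finding: ${AI_SCAN_COST_USD / FINDINGS:.2f}")
```

Under these inputs the human-led estimate spans about $150,000 to $490,000, and the compute cost works out to under $50 per confirmed finding. The result is sensitive to the assumed hours, which is the least certain input.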

The cost compression is not an argument that security research has become cheaper across the board. Vulnerability triage, patch development, coordination with upstream maintainers, regression testing, and deployment remain labor-intensive and expensive human processes. The depthfirst result shows that the discovery half of the equation has moved to near-zero cost, while the remediation half has not. That asymmetry is the core problem for enterprise security economics. When discovery is cheap and remediation is expensive, the natural result is a growing backlog of known vulnerabilities without corresponding patches—exactly the condition the industry is now experiencing. As of late May 2026, only 97 of 1,596 vulnerabilities disclosed through the Glasswing process had been patched, a remediation rate of roughly six percent [6]. The other 94 percent had no publicly available patch as of that date [6], representing a known, confirmed exposure surface awaiting remediation.
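The backlog figures can be reproduced from the two counts cited above. This is a small sketch using only those counts; the function name is illustrative.

```python
DISCLOSED = 1_596  # vulnerabilities disclosed through the Glasswing process [6]
PATCHED = 97       # of those, patched as of late May 2026 [6]


def remediation_rate(patched, disclosed):
    """Return the fraction of disclosed vulnerabilities that have a patch."""
    if disclosed <= 0:
        raise ValueError("disclosed must be positive")
    return patched / disclosed


rate = remediation_rate(PATCHED, DISCLOSED)
unpatched = DISCLOSED - PATCHED
print(f"Patched: {rate:.1%}; still unpatched: {unpatched} ({1 - rate:.1%})")
```

Tracking the same ratio over successive reporting dates would show whether the backlog is shrinking or growing, although only the single late-May data point is available here.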

Age of Vulnerabilities and What It Implies

The age distribution of the FFmpeg findings carries a message that security teams should internalize. A bug introduced in 2003 and discovered in 2026 is not evidence that FFmpeg’s developers were negligent; it is evidence that the human inspection capacity applied to even a widely used and security-conscious open-source project has been insufficient to find all the flaws that an AI system scanning the same codebase can surface in a single automated pass [1][2]. Human code review is expensive, attention-limited, and naturally concentrates on recently changed code rather than stable legacy routines. AI agents are not subject to the same attention-concentration effects as human reviewers and can traverse every function in a large codebase systematically—though they carry their own limitations, including sensitivity to semantic complexity, inability to reason about runtime and environment-dependent state, and tool-specific false-negative patterns that vary by vulnerability class.

The practical implication is that enterprise security teams should not assume the absence of prior CVEs in a component indicates the absence of exploitable conditions. For widely embedded C and C++ libraries with long histories—FFmpeg, OpenSSL, ImageMagick, libpng, libwebp, and comparable projects—the baseline assumption should shift to one of unconfirmed latent vulnerabilities until an AI-assisted audit has been completed. The Project Glasswing disclosures have already demonstrated this pattern at scale, with vulnerabilities found in major operating systems, web browsers, and foundational libraries that had persisted through years of human review and conventional fuzzing [4][5].

The Network-Reachable RCE Primitive

Among the 21 FFmpeg zero-days, the heap buffer overflow in the AV1 RTP depacketizer warrants particular attention from network defenders. Unlike many of the other findings—which require an attacker to supply a malicious media file that a user opens locally—this vulnerability is exploitable over the network against any process running FFmpeg with a live RTSP stream source. The attack is triggered by a single 183-byte packet; an attacker controls a streaming endpoint and a victim simply executes ffmpeg -i rtsp://attacker/stream, a command that appears in countless automated media processing pipelines, broadcast ingest systems, security camera monitoring setups, and developer testing workflows [1]. Depthfirst confirmed a working remote code execution primitive in its research writeup.

The exposure profile for this particular flaw is broader than the CVE numbering alone suggests. FFmpeg is frequently invoked in server-side media processing pipelines that accept RTSP or RTMP sources from external users or from the network perimeter. A media transcoding service, a broadcast platform’s ingest layer, or a security operations center’s camera feed aggregator may expose this vulnerability to network-accessible untrusted input without any user interaction beyond the pipeline’s normal operation. Organizations running FFmpeg in server-side contexts should treat this as a priority remediation item requiring immediate action rather than standard patch cycle scheduling.
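One way to locate pipelines with this exposure profile is to review the FFmpeg command lines they run and flag any input that is pulled from a network stream. The sketch below is illustrative rather than a complete detector: it only inspects arguments that follow -i, the scheme list is an assumption that should be adjusted per deployment, and the host name camera01.internal is hypothetical.

```python
import shlex
from urllib.parse import urlsplit

# Schemes treated as network stream sources in this sketch (an assumption;
# extend it to match the protocols a given deployment actually uses).
STREAM_SCHEMES = {"rtsp", "rtsps", "rtp", "rtmp", "rtmps"}


def network_inputs(command_line):
    """Return the -i arguments of a command line that use a stream scheme."""
    tokens = shlex.split(command_line)
    hits = []
    # Pair each token with its successor so "-i <value>" can be matched.
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-i" and urlsplit(value).scheme.lower() in STREAM_SCHEMES:
            hits.append(value)
    return hits


def untrusted_inputs(command_line, trusted_hosts):
    """Return stream inputs whose host is not in the trusted_hosts set."""
    return [
        url
        for url in network_inputs(command_line)
        if (urlsplit(url).hostname or "") not in trusted_hosts
    ]


# Example: the command from the depthfirst writeup, checked against a
# hypothetical allowlist containing a single internal camera host.
cmd = "ffmpeg -i rtsp://attacker/stream -c copy out.mkv"
print(untrusted_inputs(cmd, trusted_hosts={"camera01.internal"}))
```

A check like this identifies where untrusted stream sources enter a pipeline; it does not mitigate the flaw itself, which still requires the upstream fix.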

Chrome 149 and the Systemic Volume Problem

The simultaneous release of Chrome 149 with 429 patched vulnerabilities—the highest single-release patch count in Chrome’s history [2][3]—illustrates the same dynamics at the browser layer that the FFmpeg disclosure illustrates at the library layer. Of the 429 bugs, more than 100 were rated critical or high severity. The single most severe flaw, CVE-2026-10881, scored 9.6 on the CVSS scale and provided a sandbox escape capability through an out-of-bounds read/write in Chrome’s ANGLE graphics engine, earning a $97,000 bug bounty payment [2].

Google revamped its Vulnerability Reward Program in April 2026 specifically to manage the volume of AI-generated submissions, as reported in CSA Labs’ May 2026 analysis [6]. HackerOne’s Internet Bug Bounty program experienced a more severe consequence: the valid submission rate fell from 15 percent to below 5 percent as AI tools flooded the platform with plausible but unconfirmed reports, ultimately leading HackerOne to pause new bug submissions entirely in March 2026 [6]. The operational overhead of triaging high volumes of low-confidence AI-generated reports is consuming security engineering capacity that would otherwise go to remediation—a second-order effect of the discovery economics shift that compounds the backlog problem.

The Patch Economics Mismatch

The structural mismatch between discovery economics and remediation economics is now well documented. Synack’s 2026 State of Vulnerabilities Report, analyzing more than 11,000 vulnerabilities, found that even organizations with mature vulnerability management programs achieved a mean time to remediate of 38 days for critical vulnerabilities, down from 63 days in prior periods [7]. Against an exploitation timeline that Mandiant’s M-Trends 2026 report characterizes as going negative—with exploitation beginning before a patch is even available, on average, for actively weaponized vulnerability classes [9]—a 38-day MTTR means organizations are exposed from the moment a vulnerability becomes weaponizable through the full remediation cycle. That exposure window grows considerably when adversaries operate their own AI-assisted discovery tools, since the assumption of simultaneous discovery no longer holds: an adversary who identifies a flaw before public disclosure has an unconstrained exploitation window.

IBM’s 2025 Cost of a Data Breach Report puts the global average breach cost at $4.44 million, with U.S. organizations averaging $10.22 million [8]. Those figures reflect the total cost defenders absorb—detection, escalation, notification, response, and regulatory exposure—and should not be conflated with the direct economic return to an adversary. For financially motivated attackers, the relevant metric is vulnerability market pricing: a confirmed, network-reachable RCE in a widely deployed component can command prices ranging from tens of thousands to over one million dollars on brokerage markets, creating a substantial return on a $1,000 discovery investment before any direct exploitation is attempted. The asymmetry between discovery economics and remediation economics is not merely an operational challenge—it represents a structural advantage for offense that enterprise security programs must explicitly account for in their risk posture.

Recommendations

Immediate Actions

Organizations that use FFmpeg in any capacity (as a direct dependency, through a media processing framework, or via a cloud service) should inventory their exposure and prioritize patching the nine CVEs that have received formal assignment (CVE-2026-39210 through CVE-2026-39218). The AV1 RTP depacketizer flaw in particular should be treated as a critical-priority remediation item for any FFmpeg deployment that accepts RTSP or RTP input from untrusted sources. Security teams should check the FFmpeg project’s release notes and upstream patch commits to confirm that fixes for the remaining twelve bugs, which have been fixed upstream but not yet assigned CVE identifiers, are included in the version of FFmpeg their deployments use.

For Chrome deployments, organizations should accelerate rollout of Chrome 149 to end-user workstations. The CVSS 9.6 sandbox escape (CVE-2026-10881) represents a severe risk for any environment where users browse untrusted content, and the record scope of the 429-bug release means that prior browser versions carry an unusually large aggregate exposure surface. Endpoints that rely on managed update mechanisms should be audited to confirm that Chrome 149 has actually been installed rather than sitting in a pending update queue.

Short-Term Mitigations

Organizations should audit their third-party software bill of materials (SBOM) for FFmpeg as a transitive dependency. Many products that include FFmpeg do not surface it prominently in their dependency disclosures; proactive vendor outreach to confirm patch status for CVE-2026-39210 through CVE-2026-39218 is warranted wherever FFmpeg is consumed indirectly. Organizations that process media from untrusted external sources should evaluate whether network-layer controls—such as restricting RTSP/RTP traffic to known-good sources or running media processing in isolated sandbox environments—can reduce their exposure while patches are staged.
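The SBOM audit described above can be partly automated. The following sketch assumes a CycloneDX-style JSON document in which components may nest inside other components, and it searches for names that suggest FFmpeg or its bundled libraries. The hint list, the sample SBOM, and every component name and version in it are hypothetical; many real SBOMs express transitive relationships in a separate dependency graph instead, which this sketch does not read.

```python
import json

# Name fragments that suggest FFmpeg or its bundled libraries. This list is an
# illustrative assumption, not an exhaustive inventory of packaging names.
FFMPEG_HINTS = ("ffmpeg", "libavcodec", "libavformat", "libavutil")


def find_ffmpeg_components(components, path=()):
    """Recursively yield (dependency path, version) for FFmpeg-like names."""
    for component in components or []:
        name = component.get("name", "")
        here = path + (name,)
        if any(hint in name.lower() for hint in FFMPEG_HINTS):
            yield " > ".join(here), component.get("version", "unknown")
        # Descend into nested components, if this component has any.
        yield from find_ffmpeg_components(component.get("components"), here)


# Hypothetical SBOM fragment: FFmpeg's codec library sits three layers deep.
sbom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "components": [
    {"name": "video-portal", "version": "2.3.0", "components": [
      {"name": "thumbnailer", "version": "0.9.1", "components": [
        {"name": "libavcodec", "version": "60.3.100"}
      ]}
    ]},
    {"name": "requests", "version": "2.32.0"}
  ]
}
""")

for dep_path, version in find_ffmpeg_components(sbom.get("components")):
    print(f"{dep_path} (version {version})")
```

Name matching of this kind will miss statically linked or renamed copies of FFmpeg, so a clean result should prompt vendor confirmation rather than replace it.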

Vulnerability management programs should revisit their MTTR targets in light of current exploitation timelines. A 38-day average MTTR was a meaningful improvement in prior years but is no longer sufficient as a performance benchmark when exploitation begins before patch availability for the most actively weaponized vulnerabilities. Teams should consider a tiered approach that applies emergency 24-to-48-hour remediation cycles to network-reachable, proof-of-concept-confirmed vulnerabilities in widely deployed components, reserving standard patch cycle scheduling for lower-risk findings.
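The tiered approach can be expressed as a simple triage rule. The sketch below is one possible encoding, not a prescribed policy: the Finding fields mirror the three criteria named above, the 48-hour value is the upper bound of the emergency tier, and the 30-day standard cycle is a placeholder assumption that each program would replace with its own target.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Finding:
    """A vulnerability finding described by the three triage criteria."""
    identifier: str
    network_reachable: bool
    poc_confirmed: bool
    widely_deployed: bool


EMERGENCY_SLA = timedelta(hours=48)  # upper bound of the 24-to-48-hour tier
STANDARD_SLA = timedelta(days=30)    # placeholder standard cycle (assumption)


def remediation_sla(finding):
    """Return the emergency SLA only when all three criteria are met."""
    if (
        finding.network_reachable
        and finding.poc_confirmed
        and finding.widely_deployed
    ):
        return EMERGENCY_SLA
    return STANDARD_SLA


# Hypothetical identifiers used for illustration only.
urgent = Finding("av1-rtp-depacketizer-overflow", True, True, True)
routine = Finding("local-file-parser-bug", False, True, True)

for item in (urgent, routine):
    hours = remediation_sla(item).total_seconds() / 3600
    print(f"{item.identifier}: remediate within {hours:.0f} hours")
```

Requiring all three criteria keeps the emergency tier small enough to be sustainable; a program could instead score findings on a scale, at the cost of a less transparent rule.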

Strategic Considerations

The depthfirst FFmpeg engagement should serve as a planning scenario for enterprise security leadership. At roughly $1,000 for 21 zero-days, the economics now support adversaries running continuous, automated scans across an organization’s entire software dependency tree. Organizations should operate under the assumption that unknown vulnerabilities in their third-party software components are being actively sought by adversarial AI tools and plan accordingly. This has direct implications for defense-in-depth architecture: the traditional model of treating patching as the primary mitigation and runtime controls as secondary must be rebalanced toward layered detection and containment that remains effective when a patch does not yet exist.

Security teams should evaluate AI-assisted vulnerability discovery programs for their own use. The same tooling that makes offensive discovery inexpensive makes defensive pre-disclosure discovery feasible for organizations that want to find their exposure before adversaries do. Several vendors now offer AI-driven SAST and software composition analysis tools that apply approaches similar to the depthfirst methodology to customer codebases. Investing in these capabilities for critical internally developed software and for high-priority open-source dependencies can allow organizations to identify and remediate issues before public disclosure.

Support for the open-source software ecosystem requires explicit organizational investment rather than a passive expectation that volunteer maintainers will absorb the additional remediation workload created by AI-accelerated discovery. Contributing engineering resources to OpenSSF-funded projects, funding maintainer time through programs like GitHub Sponsors or Tidelift, and establishing direct communication channels with the maintainers of critical dependencies are concrete actions that address the structural mismatch between discovery volume and remediation capacity at its source.

CSA Resource Alignment

The FFmpeg zero-day cluster and the broader AI discovery acceleration trend engage several CSA frameworks and publications directly. The CSA AI Controls Matrix (AICM) v1.0 addresses the supply chain security and software dependency risks that make FFmpeg’s ubiquity a systemic concern; AICM’s domain on AI system integrity includes controls for verifying that AI-enabled pipelines operate on patched and audited third-party dependencies. The MAESTRO threat modeling framework for agentic AI systems is directly applicable to the autonomous scanning systems that depthfirst and Glasswing have deployed: organizations evaluating whether to deploy similar AI vulnerability discovery agents should use MAESTRO to characterize the trust boundaries, output validation requirements, and human-in-the-loop checkpoints appropriate for such systems.

The CSA STAR program provides a structure for cloud service providers to disclose their vulnerability management posture, including their use of AI-assisted discovery tools and their MTTR commitments for critical-severity findings. As AI-accelerated vulnerability discovery becomes an industry-standard practice, STAR assessments should increasingly probe whether cloud providers are applying these tools defensively and what their patch deployment timelines look like for widely embedded components like FFmpeg. CSA’s Zero Trust guidance is also relevant: the network-reachable AV1 RTP depacketizer vulnerability illustrates why zero trust network segmentation and process isolation remain essential countermeasures even when patch timelines compress—runtime isolation limits the blast radius of an exploitation event that occurs before a patch is available.

CSA Labs published a dedicated research note on the AI vulnerability disclosure velocity crisis in May 2026, examining the structural mismatch between Glasswing’s disclosure rate and the open-source ecosystem’s remediation capacity [6]. The FFmpeg findings described here represent a continuation and intensification of that trend, with an additional dimension: the $1,000 cost figure establishes a new baseline for the economic feasibility of adversarial discovery that the prior note’s analysis had not yet quantified.

References

[1] depthfirst. “21 Zero-Days in FFmpeg.” depthfirst Research, June 2026.

[2] The Hacker News. “AI Agent Uncovers 21 Zero-Days in FFmpeg; Chrome Patches Record 429 Bugs.” The Hacker News, June 6, 2026.

[3] The Next Web. “An AI agent found 21 zero-days in FFmpeg for $1,000. Chrome just patched a record 429 bugs.” The Next Web, June 6, 2026.

[4] Anthropic. “Project Glasswing.” Anthropic, 2026.

[5] Anthropic. “Glasswing Initial Update.” Anthropic Research, 2026.

[6] CSA AI Safety Initiative. “Project Glasswing and the AI Vulnerability Disclosure Velocity Crisis.” CSA Labs, May 2026.

[7] Synack. “The 2026 State of Vulnerabilities Report: Industry Insights.” Synack, 2026.

[8] IBM. “2025 Cost of a Data Breach.” IBM Security, 2025.

[9] Mandiant. “M-Trends 2026.” Google Cloud, 2026.

[10] Wikipedia. “FFmpeg.” Wikimedia Foundation, accessed June 2026.
