Authors: Cloud Security Alliance AI Safety Initiative
Published: 2026-03-17

Categories: Vulnerability Management, Agentic AI Security, Security Operations, Threat Intelligence

Noise Over Signal: AI Agents Flood Disclosure Pipelines

Key Takeaways

The vulnerability disclosure ecosystem (bug bounty platforms, CVE numbering infrastructure, and open-source project maintainers) is absorbing a surge of AI-generated reports that appears to be degrading the signal quality on which security triage depends. AI-assisted and AI-generated vulnerability submissions have overwhelmed multiple disclosure channels. The curl project shut down its HackerOne bug bounty program in January 2026 after 95% of 2025 submissions proved invalid, with volume running at eight times historical norms [1][2]. Bugcrowd recorded a 334% spike in submission queue length over three weeks, which it attributed to unvalidated AI automation [3].

CVE publication volume reached 48,185 in 2025 — a ninth consecutive record year — while NVD’s enrichment analysis capacity covered only 28% of newly disclosed entries, down from 46.2% in 2024 [4][5]. FIRST’s 2026 forecast projects a median of 59,427 CVEs for the current year, with upper-bound scenarios exceeding 117,000 [6]. CISA formally acknowledged the crisis in September 2025, announcing a strategic transition of the CVE program from a “Growth Era” to a “Quality Era” emphasizing enrichment standards and signal fidelity over raw volume [7].

The same AI capabilities fueling the noise problem are producing genuinely high-quality vulnerability research. AISLE’s AI-driven OpenSSL audit (February 2026) identified 12 previously unknown vulnerabilities including a 27-year-old bug [8], and OpenAI’s Aardvark system disclosed findings that yielded 10 CVE identifiers [9]. The disclosure signal crisis is one of discipline and governance, not of AI capability itself.

These disclosure pipeline pressures coincide with, and compound, a persistent SOC alert fatigue problem. False positives already account for more than 46% of all alerts, and industry studies report that between 40% and 63% of daily alerts go uninvestigated [10][11]. The relationship is correlational rather than causally established: SOC alert fatigue has well-documented independent drivers in detection rule proliferation and tool sprawl, but disclosure signal degradation intensifies analyst workload in environments already stretched by alert volume.

Background

The Architecture of Vulnerability Disclosure

Vulnerability disclosure operates through layered infrastructure whose components are interdependent. At the base, security researchers and automated tools identify weaknesses in software systems and submit findings through structured channels: vendor-operated security advisory contacts, bug bounty platforms, CVE Numbering Authorities (CNAs), and directly to open-source project maintainers. These submissions flow upward through triage, validation, and enrichment processes before becoming actionable intelligence for defenders. The MITRE-administered CVE program, funded by CISA, underpins the entire ecosystem by providing the canonical identifier system that downstream tools, patch management systems, and threat intelligence platforms depend on [4]. NVD, operated by NIST, enriches those CVE records with severity scores, affected product data, and exploit references [5].

Each layer of this pipeline depends on a favorable signal-to-noise ratio. A triage analyst, a CNA coordinator, or an open-source maintainer allocates a fixed amount of human time per incoming report. That time budget is calibrated against historical submission volumes and the expectation that a meaningful fraction of reports reflect genuine, reproducible vulnerabilities. When submission volume rises dramatically while validity rates collapse, the pipeline’s human capacity is consumed by filtering noise rather than advancing genuine findings. This is not a novel problem — security teams have long faced alert fatigue and duplicate CVE submissions — but the economics of AI-assisted report generation have accelerated the dynamic to a scale that threatens the pipeline’s functional integrity.

How AI Agents Changed the Economics of Reporting

Prior to the widespread availability of large language model-based tools, generating a plausible-looking vulnerability report required meaningful technical skill, manual code review time, and domain-specific knowledge of the target system’s threat model. These friction costs acted as a natural filter: the effort required to produce a report was high enough that trivially false or speculative findings were rarely submitted to formal channels. A researcher submitting a low-quality finding risked reputation damage that exceeded any potential bounty reward.

Agentic AI systems have sharply reduced this friction for low-quality submissions. Contemporary AI agents can analyze source code repositories, web application endpoints, or binary artifacts at scale, automatically generate report text describing apparent weaknesses, and submit those reports through programmatic APIs, all without human review of each individual output. The cost of generating a plausible-looking report has fallen from hours of skilled research time to minutes of agentic pipeline execution. Incentives have simultaneously inverted: because bug bounties pay per valid finding, a high-volume automated pipeline can be economically rational with only a small percentage of valid findings, even if 95% of its submissions waste maintainer time and platform triage capacity [2][3]. Seth Larson, Security Developer-in-Residence at the Python Software Foundation, identified this dynamic in December 2024: “Whatever happens to Python or pip is likely to eventually happen to more projects” [12]. By early 2026, that prediction had materialized across multiple high-profile disclosure channels simultaneously.
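The incentive inversion can be made concrete with a back-of-the-envelope expected-value calculation. The sketch below is illustrative only: the bounty amount, per-report generation cost, and triage time are hypothetical placeholders rather than measured data, and only the 95% invalid rate is taken from the figure cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineEconomics:
    """Hypothetical inputs for one automated submission campaign."""
    reports_submitted: int
    valid_rate: float               # fraction of reports that are genuine
    avg_bounty_usd: float           # assumed payout per valid report
    cost_per_report_usd: float      # assumed compute cost per generated report
    triage_hours_per_report: float  # assumed maintainer time per report


def submitter_net_usd(e: PipelineEconomics) -> float:
    """Expected net return to the party running the pipeline."""
    revenue = e.reports_submitted * e.valid_rate * e.avg_bounty_usd
    cost = e.reports_submitted * e.cost_per_report_usd
    return revenue - cost


def wasted_triage_hours(e: PipelineEconomics) -> float:
    """Maintainer hours spent on reports that turn out to be invalid."""
    invalid_reports = e.reports_submitted * (1.0 - e.valid_rate)
    return invalid_reports * e.triage_hours_per_report


# Placeholder scenario: 1,000 reports, 5% valid (that is, 95% invalid).
example = PipelineEconomics(
    reports_submitted=1000,
    valid_rate=0.05,
    avg_bounty_usd=500.0,
    cost_per_report_usd=0.50,
    triage_hours_per_report=1.0,
)
print(f"submitter net: ${submitter_net_usd(example):,.0f}")
print(f"triage hours wasted: {wasted_triage_hours(example):,.0f}")
```

With these placeholder values the submitter nets about $24,500 while roughly 950 hours of triage are spent on invalid reports. The submitter captures the gain and the maintainers bear the cost, which is the asymmetry the paragraph describes.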

Security Analysis

The Bug Bounty Platform Crisis

The most visible evidence of disclosure signal degradation comes from commercial bug bounty platforms and the open-source projects they serve. The curl project’s experience is the best-documented case study. Daniel Stenberg, curl’s founder, first publicly documented AI-assisted report flooding in January 2024. By mid-2025 roughly 20% of all submissions were categorized as AI-generated noise, and, per Stenberg’s public documentation, no valid vulnerability had yet originated from an AI-assisted report [1][2]. Submission volume had spiked to eight times the project’s historical baseline. In January 2026, Stenberg announced the program’s termination: “The never-ending slop submissions take a serious mental toll to manage… time and energy that is completely wasted” [2]. The program was eventually reconstituted without monetary rewards, deliberately eliminating the financial incentive structure that made automated bulk submission rational.

Bugcrowd’s experience illustrates how the problem manifests at platform scale rather than at individual program level. The platform identified three source categories producing low-quality automated submissions: organizations training AI reinforcement learning systems through sock-puppet submissions, novice researchers deploying AI agents without manual validation steps, and fully automated submission pipelines with no human in the loop [3]. Bugcrowd’s policy response — permanent bans for submission farming, 30-day suspensions for accounts with 10 or more consecutive invalid reports, and identity verification requirements for repeat offenders — introduces enforcement overhead that did not exist before automated submission volumes rose [3].
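A consecutive-invalid-report rule of the kind described above is simple to express in code. The following is a minimal sketch, not Bugcrowd's implementation: the function names are hypothetical, and the choice to evaluate the longest streak in an account's history (rather than only the most recent one) is an assumption.

```python
from typing import Iterable

INVALID_STREAK_THRESHOLD = 10  # "10 or more consecutive invalid reports"
SUSPENSION_DAYS = 30           # "30-day suspensions"


def longest_invalid_streak(outcomes: Iterable[bool]) -> int:
    """Return the longest run of consecutive invalid reports.

    `outcomes` is in chronological order; True means the report was valid.
    """
    longest = 0
    current = 0
    for is_valid in outcomes:
        # A valid report resets the streak; an invalid one extends it.
        current = 0 if is_valid else current + 1
        longest = max(longest, current)
    return longest


def suspension_days(outcomes: Iterable[bool]) -> int:
    """Hypothetical helper: suspension length implied by the streak rule."""
    if longest_invalid_streak(outcomes) >= INVALID_STREAK_THRESHOLD:
        return SUSPENSION_DAYS
    return 0


history = [True] + [False] * 10
print(suspension_days(history))
```

A streak-based rule tolerates researchers who are occasionally wrong, since a single valid report resets the count, while still catching accounts that submit nothing but noise.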

HackerOne’s data provides the clearest quantitative picture of the trend at industry scale. The platform’s 2025 Annual Hacker-Powered Security Report documented a 210% year-over-year increase in AI-related vulnerability reports and noted that 560 or more valid reports had been submitted by fully autonomous systems over the prior year [13]. HackerOne co-founder Michiel Prins acknowledged the accompanying noise problem directly: “a rise in false positives — vulnerabilities that appear real but are generated by LLMs and lack real-world impact.” The platform’s response, a hybrid AI-human triage system called Hai Triage, reflects an emerging pattern of using automation to manage automation-generated noise — a dynamic that raises long-term questions about the sustainability of the approach [13].

CVE Infrastructure Under Strain

The same volume pressures bearing on commercial bug bounty programs are straining CVE publication and enrichment infrastructure at a structural level. CVE publication has set records for nine consecutive years: 40,009 in 2024 (a 32% increase over 2023) and 48,185 in 2025 (a further increase of roughly 20%) [4][5]. FIRST’s February 2026 forecast projects a 2026 median of 59,427 CVEs, with a 90% confidence interval extending to 117,673 [6]. AI-generated reports are not the sole driver of this growth: expanded CNA onboarding, greater vendor participation, and increased vulnerability research activity all contribute. Available evidence nonetheless suggests the AI component may be a material and accelerating factor, though its precise contribution to CVE volume growth has not been independently quantified.
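To put the forecast in proportion, the growth it implies over the 2025 total can be computed directly from the figures quoted above. This is a minimal arithmetic sketch; the function and constant names are illustrative.

```python
def growth_pct(previous: int, current: int) -> float:
    """Percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0


CVE_2025 = 48_185            # CVEs published in 2025, as cited above
FIRST_2026_MEDIAN = 59_427   # FIRST median forecast for 2026
FIRST_2026_UPPER = 117_673   # upper end of FIRST's 90% interval

median_growth = growth_pct(CVE_2025, FIRST_2026_MEDIAN)
upper_growth = growth_pct(CVE_2025, FIRST_2026_UPPER)

print(f"median scenario: {median_growth:.1f}% over 2025")
print(f"upper scenario:  {upper_growth:.1f}% over 2025")
```

On these inputs the median scenario implies growth of about 23% over 2025, while the upper bound implies roughly 144%, which is more than a doubling of annual volume.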

NVD’s enrichment capacity has not kept pace. In 2025, NVD fully analyzed only 28% of newly disclosed CVEs, compared to 46.2% in 2024 [5]. Approximately 54,914 CVEs from the 2024–2025 period remain awaiting enrichment, and nearly 100,000 CVEs now carry a “Deferred” designation in NVD’s queue [4][5]. This enrichment gap has direct operational consequences for defenders: vulnerability scanners, patch management tools, and threat intelligence platforms that rely on NVD CVSS scores and product enumeration data are likely to encounter increasingly incomplete records absent compensating enrichment from CISA Vulnrichment or alternative sources, reducing their ability to prioritize remediation accurately.

CISA has responded by launching Vulnrichment, an independent enrichment program that supplements NVD data and publishes its outputs through CISA’s own feeds [7]. The September 2025 “CVE Quality for a Cyber Secure Future” policy document formalizes CISA’s pivot away from NVD as a single authoritative source, explicitly transitioning the CVE program from a Growth Era orientation — maximizing identifier assignment volume — to a Quality Era emphasis on enrichment standards, validation rigor, and federated data governance [7]. The CVE Board’s October 2025 meeting minutes document parallel deliberation at the CNA governance level, acknowledging “a rising volume of low quality or unconfirmed vulnerability reports, including AI assisted findings, black box web scans, duplicates, student assignment driven submissions, and claims against end-of-life software that are hard to validate” [14]. Proposed responses including validation signaling and evidence requirements were routed to the CVE AI Working Group for further development, with no binding restrictions implemented as of March 2026 [14].

Open-Source Maintainers: The Most Exposed Tier

While platforms like HackerOne and Bugcrowd have commercial resources to invest in triage automation and policy enforcement, open-source project maintainers — many of them volunteers sustaining projects used by millions of systems — have no equivalent capacity buffer. Christopher Robinson, CTO of the Open Source Security Foundation (OpenSSF), confirmed in March 2026 that the frequency of cases “in which reporters cannot answer maintainers’ follow-up questions” is increasing, and that projects previously receiving two to three security reports per week are now receiving hundreds simultaneously [15]. Given that qualified security triage typically requires hours of careful review per report, such volume is not compatible with volunteer-maintainer capacity structures.

The asymmetry Robinson identifies is fundamental: AI systems can generate reports at near-zero marginal cost using essentially unlimited compute, while human maintainers must evaluate each report with irreducibly finite time. Jeff Geerling, managing more than 300 open-source repositories, captured this structural imbalance in early 2026: “AI companies have unlimited resources to generate submissions, while human maintainers must review everything with finite time and energy” [16]. GitHub’s introduction of controls allowing maintainers to disable pull requests entirely — which Geerling and other maintainers have adopted specifically to address AI flooding — reflects the severity of the threat to open-source project viability [16].

The Dual-Use Paradox and Its Governance Implications

The disclosure pipeline signal crisis does not arise because AI is inherently incapable of producing quality vulnerability research. The evidence suggests the opposite. AISLE’s AI agent identified 12 previously unknown OpenSSL vulnerabilities in February 2026, including a flaw that had been present in the codebase for 27 years; OpenSSL’s maintainers specifically praised the quality of AISLE’s reports and coordination [8]. OpenAI’s Aardvark autonomous security researcher identified vulnerabilities in open-source projects resulting in 10 CVE assignments, using sandboxed exploit confirmation before disclosure to reduce false positives [9]. The DEF CON 33 AI Cyber Challenge in 2025 saw competing AI systems discover 54 unique vulnerabilities across 54 million lines of code, with all genuine findings being responsibly disclosed [17].

These high-quality examples share characteristics absent from the noise-generating systems: pre-submission technical validation, sandbox-based exploit confirmation, human review gates, and coordination with maintainers before public disclosure. The crisis is therefore one of governance and incentive alignment, not of AI technical capability. The same technology that surfaced a 27-year-old OpenSSL bug is generating thousands of low-quality reports because the economic and structural incentives (bounty rewards, lack of quality enforcement, and near-zero submission cost) reward volume over validity. Addressing the crisis requires governance intervention at the incentive layer, not technical prohibition of AI-assisted research.

Recommendations

Immediate Actions

Organizations operating bug bounty programs should audit submission patterns for signals of automated low-quality flooding: templated language, inability to reproduce findings with provided reproduction steps, and lack of response to maintainer follow-up questions. Programs experiencing these patterns should consider graduated enforcement responses — extended validation windows, temporary holds on accounts with elevated invalid ratios, or structured declaration requirements for AI tool use — before escalating to program suspension. The curl project’s experience suggests that eliminating monetary rewards for low-value finding categories may serve as a targeted friction mechanism [2], though evidence from a single program is insufficient to generalize across all bug bounty contexts.
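The three audit signals named above can be tallied per submitter with a few lines of code. The sketch below is one possible heuristic under stated assumptions: the `Report` fields are hypothetical, templated language is approximated by pairwise text similarity using the standard library's `difflib.SequenceMatcher`, and the 0.9 similarity threshold is an arbitrary starting point rather than a validated cutoff.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


@dataclass
class Report:
    body: str
    reproduced: bool         # did triage reproduce it from the given steps?
    followup_answered: bool  # did the reporter answer maintainer questions?


def templated_fraction(reports: list[Report], threshold: float = 0.9) -> float:
    """Fraction of report pairs whose text is near-identical."""
    pairs = list(combinations(reports, 2))
    if not pairs:
        return 0.0
    similar = sum(
        1
        for a, b in pairs
        if SequenceMatcher(None, a.body, b.body).ratio() >= threshold
    )
    return similar / len(pairs)


def flooding_signals(reports: list[Report]) -> dict[str, float]:
    """Summarize the three audit signals for one submitter's reports."""
    n = len(reports)
    if n == 0:
        return {"templated": 0.0, "not_reproduced": 0.0, "no_followup": 0.0}
    return {
        "templated": templated_fraction(reports),
        "not_reproduced": sum(not r.reproduced for r in reports) / n,
        "no_followup": sum(not r.followup_answered for r in reports) / n,
    }
```

Pairwise comparison scales quadratically with the number of reports, so this form suits per-account audits rather than platform-wide scans. The resulting fractions are inputs for human review, not grounds for automatic enforcement.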

Security teams consuming vulnerability data from NVD-integrated tools should treat NVD enrichment gaps as a structural risk rather than a temporary operational issue. CISA’s Vulnrichment feeds, ENISA’s European Vulnerability Database (EUVD), and vendor-specific security advisories should be incorporated as parallel enrichment sources to compensate for NVD analysis backlogs [7][18]. Patch prioritization workflows should be validated against multiple enrichment sources before treating a CVE as low-severity based on incomplete NVD records.
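One way to encode the multi-source check is to refuse a low-severity classification unless more than one enrichment source has actually scored the CVE. The sketch below assumes base scores have already been extracted from each feed into a dictionary; the source labels, the function names, and the conservative choice to take the highest reported score are all assumptions, not features of any particular feed or tool.

```python
from typing import Optional

# Hypothetical source labels; each real feed needs its own parser.
Scores = dict[str, Optional[float]]

LOW_SEVERITY_CEILING = 4.0  # CVSS base scores below 4.0 are rated Low


def effective_cvss(scores: Scores) -> Optional[float]:
    """Return the highest base score any source reports, or None."""
    known = [s for s in scores.values() if s is not None]
    return max(known) if known else None


def may_deprioritize(scores: Scores, min_sources: int = 2) -> bool:
    """Allow low-severity handling only when several sources agree."""
    known = [s for s in scores.values() if s is not None]
    if len(known) < min_sources:
        # Too little enrichment to justify treating the CVE as low risk.
        return False
    return max(known) < LOW_SEVERITY_CEILING


record = {"nvd": None, "cisa_vulnrichment": 3.1, "euvd": None, "vendor": 3.4}
print(effective_cvss(record), may_deprioritize(record))
```

The key property is that a missing NVD score never counts as evidence of low severity: an unenriched record stays in the normal queue until enough sources have weighed in.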

Short-Term Mitigations

Bug bounty platforms and CNAs should implement structured evidence requirements for AI-assisted submissions: mandatory reproduction steps, sandboxed exploit demonstration, and attestation of whether AI tools were used in report preparation. These requirements serve a filtering function without prohibiting AI-assisted research from researchers capable of validating their outputs. Bugcrowd’s three-category taxonomy — RL training systems, unvalidated novice automation, and fully autonomous pipelines — provides a useful model for crafting policy language that targets specific risk sources rather than AI use broadly [3].
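The structured evidence requirements above lend themselves to an intake check that runs before a report reaches a human triager. This is a minimal sketch: the `Submission` fields and the `missing_evidence` helper are hypothetical and do not reflect any platform's actual submission schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Submission:
    title: str
    reproduction_steps: list[str] = field(default_factory=list)
    sandbox_demo_ref: Optional[str] = None  # e.g. a log of the exploit run
    ai_tools_used: Optional[bool] = None    # None means no attestation given


def missing_evidence(sub: Submission) -> list[str]:
    """List the evidence requirements this submission fails to meet."""
    problems: list[str] = []
    if not any(step.strip() for step in sub.reproduction_steps):
        problems.append("reproduction steps")
    if not (sub.sandbox_demo_ref or "").strip():
        problems.append("sandboxed exploit demonstration")
    if sub.ai_tools_used is None:
        # Either answer is acceptable; only a missing attestation is flagged.
        problems.append("AI tool-use attestation")
    return problems


incomplete = Submission(title="Possible overflow in parser")
print(missing_evidence(incomplete))
```

Note that the attestation check accepts both `True` and `False`: the aim is disclosure of AI tool use, not exclusion of it, which matches the intent of filtering without prohibiting AI-assisted research.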

SOC teams should treat CVE enrichment gaps as a factor in alert prioritization algorithms. Vulnerability data from unenriched CVE records, or CVEs marked “Deferred,” carries lower baseline confidence and should be weighted accordingly in risk-scoring systems. Human analyst time should be biased toward findings with multi-source enrichment confirmation. Investing in CVSS alternatives such as EPSS (Exploit Prediction Scoring System) and SSVC (Stakeholder-Specific Vulnerability Categorization) provides richer prioritization signal than base CVSS scores alone [19].
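A confidence weighting of this kind might look like the following sketch. Every number in it is a placeholder to be tuned against local data, and the status labels, field names, and scoring formula are assumptions for illustration; only the idea of down-weighting unenriched or deferred records and favoring multi-source confirmation comes from the text above. The `epss` field is assumed to hold an EPSS probability between 0 and 1.

```python
from dataclasses import dataclass

# Hypothetical confidence weights keyed by an assumed enrichment status label.
STATUS_CONFIDENCE = {"analyzed": 1.0, "awaiting": 0.6, "deferred": 0.4}
PER_EXTRA_SOURCE_BONUS = 0.2  # placeholder bonus per additional source


@dataclass
class VulnRecord:
    cve_id: str
    epss: float              # EPSS probability in [0, 1]
    nvd_status: str          # expected to be a key of STATUS_CONFIDENCE
    enrichment_sources: int  # number of sources that enriched the record


def confidence(rec: VulnRecord) -> float:
    """Confidence in the record's data, capped at 1.0."""
    base = STATUS_CONFIDENCE.get(rec.nvd_status, 0.4)  # unknown: lowest tier
    bonus = PER_EXTRA_SOURCE_BONUS * max(rec.enrichment_sources - 1, 0)
    return min(1.0, base + bonus)


def triage_order(records: list[VulnRecord]) -> list[VulnRecord]:
    """Sort so that well-corroborated, likely-exploited CVEs come first."""
    return sorted(records, key=lambda r: r.epss * confidence(r), reverse=True)


queue = triage_order([
    VulnRecord("CVE-A", epss=0.90, nvd_status="deferred", enrichment_sources=1),
    VulnRecord("CVE-B", epss=0.50, nvd_status="analyzed", enrichment_sources=2),
])
print([r.cve_id for r in queue])
```

The identifiers `CVE-A` and `CVE-B` are dummies. A design caveat: a low confidence weight lowers a record's position in the analyst queue but says nothing about its true severity, so records pushed down this way still warrant periodic re-scoring as enrichment arrives.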

Strategic Considerations

The long-term integrity of vulnerability disclosure infrastructure requires governance mechanisms that create accountability for submission quality — analogous to the accountability structures that exist in responsible disclosure frameworks for individual researchers. The CVE AI Working Group and the broader CNA community should move beyond deliberation toward binding quality standards for AI-assisted CVE submissions, including minimum evidence requirements, validation attestation, and consequences for systematic low-quality submissions from specific CNA members or reporters. CISA’s Quality Era framework provides a credible policy basis for these standards [7].

Organizations whose open-source projects form part of their software supply chain should assess whether those projects have maintainer capacity sufficient to absorb current disclosure volumes. Supply chain risk management frameworks should extend beyond code dependency analysis to include maintainer health metrics: report-to-maintainer ratios, response time degradation, and program viability signals such as the curl program shutdown. Sponsoring maintainer security capacity — through OpenSSF’s Alpha-Omega initiative or direct engineering contributions — reduces supply chain exposure while supporting disclosure ecosystem sustainability [15].
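Two of the maintainer health metrics named above reduce to simple ratios. The sketch below shows one way to compute them; the function names and the choice of median first-response time as the degradation measure are assumptions, and the sample values are invented for illustration.

```python
from statistics import median


def reports_per_maintainer(weekly_reports: int, active_maintainers: int) -> float:
    """Weekly security reports each active maintainer must absorb."""
    if active_maintainers <= 0:
        return float("inf")  # no one is available to triage
    return weekly_reports / active_maintainers


def response_time_degradation(
    baseline_days: list[float], recent_days: list[float]
) -> float:
    """Ratio of recent to baseline median first-response time.

    Values above 1.0 mean responses are slowing down.
    """
    baseline = median(baseline_days)
    if baseline == 0:
        raise ValueError("baseline median must be positive")
    return median(recent_days) / baseline


# Invented sample values for a hypothetical dependency.
print(reports_per_maintainer(weekly_reports=200, active_maintainers=2))
print(response_time_degradation([1, 2, 3], [4, 6, 8]))
```

Tracked over time for each critical dependency, these ratios give supply chain teams an early signal that a project's disclosure handling is under strain, before a visible event such as a program shutdown.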

CSA Resource Alignment

This research note connects to several Cloud Security Alliance frameworks and publications that provide organizations with structured approaches to managing AI-driven vulnerability management challenges.

MAESTRO (Multi-Agent Environment, Security, Threat, Risk, and Outcome): MAESTRO’s threat modeling for autonomous AI agents directly addresses the agentic behaviors driving disclosure pipeline flooding. Organizations designing AI vulnerability research capabilities should apply MAESTRO threat modeling to characterize the blast radius of autonomous submission behavior before deployment, ensuring that disclosure rate controls, validation gates, and human-in-the-loop requirements are treated as security requirements rather than optional quality controls.

CSA AI Vulnerability Taxonomy: CSA’s AI Vulnerability Taxonomy, extending CVE and CWE frameworks with AI-specific vulnerability categories, provides a classification foundation for the novel failure modes emerging from AI-generated disclosure: hallucinated vulnerabilities, automation-amplified submission storms, and quality degradation in AI-CNA interfaces. Security teams developing internal AI-assisted vulnerability research programs should align their classification and reporting standards with this taxonomy.

CSA Benchmark Study of AI Agents in the SOC: CSA research on AI agent performance in SOC environments [20] provides empirical grounding for the downstream alert fatigue problem documented in this note. The study’s finding that AI-assisted triage produces faster and more consistent investigation outcomes supports investing in human-AI collaborative triage models as a countermeasure to noise-driven analyst overload, rather than treating AI as solely a source of the problem.

Top Concerns with Vulnerability Data: CSA’s research examining systemic challenges in CVE and CVSS data quality [19] anticipated many of the structural issues this note documents at greater scale. The alternative scoring frameworks evaluated in that research — EPSS, SSVC, VPR — offer practical paths for organizations whose patch prioritization is degraded by NVD enrichment gaps.

CCM (Cloud Controls Matrix): CCM controls under the Threat and Vulnerability Management (TVM) domain apply directly to the processes this note addresses. TVM-01 through TVM-09 establish requirements for vulnerability identification, remediation tracking, and patch management that should be updated to explicitly account for AI-generated report quality as an input data integrity concern.

AI Organizational Responsibilities: CSA guidance on organizational accountability for AI systems supports the governance framing of this note’s recommendations. The crisis documented here is not primarily a technical failure but a governance failure: AI systems are generating reports without accountability structures that would exist if human researchers made the same submissions. Organizational AI policies should address submission volume controls and validation requirements as part of AI system governance, not as afterthoughts.

References

[1] Daniel Stenberg, “AI Is Ruining Our Bug Bounty Program,” curl blog / The Register, May 7, 2025. https://www.theregister.com/2025/05/07/curl_ai_bug_reports/

[2] Daniel Stenberg, “curl Ends Bug Bounty Program,” The Register / BleepingComputer, January 21, 2026. https://www.theregister.com/2026/01/21/curl_ends_bug_bounty/; https://www.bleepingcomputer.com/news/security/curl-ending-bug-bounty-program-after-flood-of-ai-slop-reports/

[3] Bugcrowd Security Team, “Bugcrowd Policy Changes to Address AI Slop Submissions,” Bugcrowd Blog, 2025. https://www.bugcrowd.com/blog/bugcrowd-policy-changes-to-address-ai-slop-submissions/

[4] Socket Security Research Team, “CVE Volume Surges Past 48K in 2025,” Socket.dev Blog, 2026. https://socket.dev/blog/cve-volume-surges-past-48k-in-2025

[5] SecurityWeek, “NIST Still Struggling to Clear Vulnerability Submissions Backlog in NVD,” SecurityWeek, 2025. https://www.securityweek.com/nist-still-struggling-to-clear-vulnerability-submissions-backlog-in-nvd/; Zafran Security, “The 2025 Spike in Vulnerabilities Isn’t the Full Story,” 2025. https://www.zafran.io/resources/the-2025-spike-in-vulnerabilities-isnt-the-full-story

[6] FIRST, “Vulnerability Forecast 2026,” FIRST Blog, February 11, 2026. https://www.first.org/blog/20260211-vulnerability-forecast-2026

[7] CISA, “CISA Presents Vision for Common Vulnerabilities and Exposures (CVE) Program,” CISA News, September 10, 2025. https://www.cisa.gov/news-events/news/cisa-presents-vision-common-vulnerabilities-and-exposures-cve-program

[8] Cloud Security Alliance AI Safety Initiative, “AISLE OpenSSL Zero-Day Discovery: AI-Assisted Vulnerability Research,” CSA Research Note, February 2026. [CSA corpus reference]

[9] OpenAI, “Introducing Aardvark,” OpenAI Blog, October 31, 2025. https://openai.com/index/introducing-aardvark/; The Register, “OpenAI’s Aardvark Agentic Security Researcher,” October 31, 2025. https://www.theregister.com/2025/10/31/openai_aardvark_agentic_security/

[10] Microsoft / Omdia, “State of the SOC 2026,” Microsoft Security, 2026. Referenced in CyberDefenders, “SOC Alert Fatigue,” https://cyberdefenders.org/blog/soc-alert-fatigue/

[11] Dropzone AI, “Alert Fatigue in Cybersecurity: Definition, Causes, Modern Solutions,” Dropzone AI Glossary, 2025–2026. https://www.dropzone.ai/glossary/alert-fatigue-in-cybersecurity-definition-causes-modern-solutions-5tz9b

[12] Seth Larson, “LLM-Hallucinated Security Reports Are Rising,” Python Software Foundation / The Register, December 10, 2024. https://www.theregister.com/2024/12/10/ai_slop_bug_reports/

[13] HackerOne, “9th Annual Hacker-Powered Security Report,” HackerOne Press Release, October 1, 2025. https://www.hackerone.com/press-release/hackerone-report-finds-210-spike-ai-vulnerability-reports-amid-rise-ai-autonomy

[14] CVE Editorial Board, “CVE Board Meeting Minutes — October 29, 2025,” CVE Editorial Board Mailing List, 2025. http://www.mail-archive.com/[email protected]/msg00303.html

[15] Christopher Robinson (OpenSSF CTO), quoted in Axios, “AI Agents Spam the Volunteers Securing Open Source Software,” March 10, 2026. https://www.axios.com/2026/03/10/ai-agents-spam-the-volunteers-securing-open-source-software

[16] Jeff Geerling, “AI Is Destroying Open Source,” Jeff Geerling Blog, 2026. https://www.jeffgeerling.com/blog/2026/ai-is-destroying-open-source/

[17] OpenSSF, “OpenSSF at Black Hat USA 2025 / DEF CON 33 AI Cyber Challenge Highlights,” OpenSSF Blog, August 14, 2025. https://openssf.org/blog/2025/08/14/openssf-at-black-hat-usa-2025-def-con-33-aixcc-highlights-big-wins-and-the-future-of-securing-open-source/

[18] Cloud Security Alliance AI Safety Initiative, “ENISA Designated as EU CVE Root: Implications for NIS2 Compliance and Cross-Border Vulnerability Disclosure,” CSA Research Note, March 10, 2026. [CSA corpus reference]

[19] Cloud Security Alliance, “Top Concerns with Vulnerability Data,” CSA Research Report, 2024. [CSA corpus reference: data/source/official_csa_research/summaries/top-concerns-with-vulnerability-data-summary.md]

[20] Cloud Security Alliance, “A Benchmark Study of AI Agents in the SOC,” CSA Research Report, 2025. [CSA corpus reference: data/source/official_csa_research/summaries/a-benchmark-study-of-ai-agents-in-the-soc-summary.md]
