When the Agent has the Keys

AI is becoming a force multiplier in cybersecurity: an accelerator, not an authority. Notes from the 2026 AI Security Forum.

Jun 12, 2026

AI is becoming a force multiplier across cybersecurity. It accelerates vulnerability discovery, red-team workflows, bug-bounty submissions, triage, reporting, and defensive simulation. But it does not remove the need for expert judgment. The more AI speeds up security work, the more organizations need disciplined governance, validated evidence, clear accountability, and stronger cross-ecosystem coordination.

The practical message from the event was blunt: do not treat AI as an autonomous security authority. Treat it as an accelerator layered on top of mature security foundations.

Two communities, one problem

One thing the forum made unmistakable: cybersecurity and AI safety grew up as separate disciplines, with separate communities, and have now converged on a single problem without quite merging. The divide was structural. AI is, ultimately, an application; cybersecurity has long been the domain of network and systems engineers. Frontier models forced both into the same room, because securing an AI system now runs the whole length of the stack: from the network layer up to the model’s reasoning and the actions its agents take. Neither community closes that gap alone.

Why this matters now

AI compresses security timelines. Attackers and researchers move faster from reconnaissance to exploit hypotheses, payload variations, and reporting. Platforms are already seeing higher submission volumes and more AI-assisted findings. At the same time, organizations face a growing AI attack surface: models, prompts, agents, tools, connectors, datasets, guardrails, memories, APIs, and non-human identities.

This is a dual challenge. Security teams have to use AI to keep pace, while preventing the AI systems themselves from becoming the weak point. At its core it is a contest of velocity versus veracity: AI scales errors as fast as it scales insight, so speed without human validation produces noise, not assurance.

Collapsing attack windows

The single figure that ran through the whole forum: the time from a vulnerability becoming known to being exploited has fallen from roughly thirty days to under a day, and by some accounts to minutes or even seconds. Speakers put the mean time to exploitation at under ten hours and noted that more vulnerabilities are now exploited before they are ever publicly disclosed. They also reported submission volumes on major bug-bounty platforms roughly doubling since frontier models arrived, with nearly all top researchers now using AI in their workflow. The defensive horizon is narrow and explicit: twelve to eighteen months to harden systems before open-source models reach today’s proprietary-frontier capability, after which the same power is in everyone’s hands, attackers included.

Detection is no longer the bottleneck

Here is the twist. The same models that collapsed the attack window also made finding vulnerabilities cheap and abundant, so the constraint moved downstream. The platforms reported that the hard part is no longer detection but triage, validation, and remediation: telling real findings from plausible-looking noise, and acting on them before the window closes. AI multiplies competence and incompetence alike; a weak submission dressed up to look real still has to be read, ranked, and ruled out by someone. The bottleneck is now human attention and orchestration, which is precisely where speed stops helping and judgment starts.

The stronger the model, the bigger the defense bill

The model behind all of this has a name: Mythos. Anthropic published its 244-page system card in April 2026, restricted the preview to five launch partners, and shipped the locked-down public version, Fable 5, in June. The card documents a model that saturates cybersecurity benchmarks and exploits browser zero-days almost at will. That capability cuts both ways, and the defensive side carries a cost the market underrates. The defense blueprints shown at the forum were multi-agent: a dozen specialized AI agents, from threat-modeling to pen-testing to secrets-detection, orchestrated on top of existing firewalls, SIEMs, and identity systems. None of it runs for free. Every agent consumes tokens and compute, continuously, so the stronger the attacker’s model becomes, the more agents defenders run and the more they pay simply to keep pace. Frontier AI does not only raise the capability bar; it raises the operating bill, and defenders inherit a compute cost that scales with the offense.
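The cost argument can be made concrete with back-of-the-envelope arithmetic. The sketch below is illustrative only: the function name, the token volume per agent, and the price per million tokens are placeholder assumptions, not figures from the forum. Only the "dozen agents" starting point comes from the blueprints described above.

```python
def monthly_agent_cost(agents, tokens_per_agent_per_day, usd_per_million_tokens, days=30):
    """Rough monthly spend for a fleet of always-on agents (hypothetical model)."""
    daily_tokens = agents * tokens_per_agent_per_day
    return daily_tokens / 1_000_000 * usd_per_million_tokens * days


# Placeholder inputs: 2M tokens per agent per day at 5 USD per million tokens.
baseline = monthly_agent_cost(agents=12, tokens_per_agent_per_day=2_000_000,
                              usd_per_million_tokens=5.0)
# Doubling the fleet to keep pace with a stronger attacker doubles the bill.
scaled = monthly_agent_cost(agents=24, tokens_per_agent_per_day=2_000_000,
                            usd_per_million_tokens=5.0)

print(f"baseline: {baseline:.0f} USD/month, scaled: {scaled:.0f} USD/month")
```

The point is the shape of the curve, not the numbers: under this simple model, spend grows linearly with the number of agents defenders feel obliged to run.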

The risks leaders are underrating

False confidence. AI-generated findings can sound technically credible while being wrong, incomplete, or non-reproducible.
Operational overload. More AI-assisted reports and scans can overwhelm triage, remediation, and disclosure.
The agentic attack surface. With agents you are no longer only protecting data; you are protecting agency. An agent with tool access is a high-privilege identity with a probabilistic core: it calls tools, touches external systems, retains context, and takes actions, raising the risk of prompt injection, authorization failures, data leakage, tool abuse, and cost exhaustion. If you cannot audit its reasoning or constrain its tool calls, the attack surface is large and unpredictable.
Governance gaps. Existing standards remain useful, but many AI-specific baselines, disclosure norms, testing requirements, and scoring methods are still immature.
Accountability ambiguity. Multi-vendor AI systems make it harder to assign responsibility when failures involve models, data, tools, integrations, or downstream decisions.
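To make "constrain its tool calls" tangible, here is a minimal sketch of a gate that sits between an agent and its tools. It assumes a hypothetical setup: `AgentPolicy`, `ToolGate`, the tool names, and the budget values are all invented for illustration and do not correspond to any particular agent framework.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical per-agent policy: allowed tools plus call and cost ceilings."""
    allowed_tools: frozenset
    max_calls: int
    max_cost_usd: float


@dataclass
class ToolGate:
    """Checks each requested tool call against the policy and logs the decision."""
    policy: AgentPolicy
    calls: int = 0
    spent_usd: float = 0.0
    audit_log: list = field(default_factory=list)

    def authorize(self, tool: str, est_cost_usd: float) -> bool:
        if tool not in self.policy.allowed_tools:
            reason = "tool not on allowlist"
        elif self.calls + 1 > self.policy.max_calls:
            reason = "call limit reached"
        elif self.spent_usd + est_cost_usd > self.policy.max_cost_usd:
            reason = "cost budget exceeded"
        else:
            reason = "ok"
        allowed = reason == "ok"
        if allowed:
            # Only approved calls count against the budget.
            self.calls += 1
            self.spent_usd += est_cost_usd
        # Denied calls are logged too, so reviewers can see what was attempted.
        self.audit_log.append(
            {"tool": tool, "cost_usd": est_cost_usd, "allowed": allowed, "reason": reason}
        )
        return allowed


policy = AgentPolicy(
    allowed_tools=frozenset({"read_ticket", "search_logs"}),
    max_calls=2,
    max_cost_usd=1.00,
)
gate = ToolGate(policy)
results = [
    gate.authorize("search_logs", 0.25),  # on the allowlist, within budget
    gate.authorize("delete_user", 0.01),  # not on the allowlist
    gate.authorize("read_ticket", 0.90),  # would exceed the cost budget
    gate.authorize("read_ticket", 0.25),  # fits the remaining budget
]
```

The design choice worth noting is that the gate is deterministic and sits outside the model: the agent's probabilistic core can request anything, but only the policy decides what runs, and every decision leaves an audit record.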

What to do about it

Strengthen foundations before scaling AI. Prioritize asset inventory, vulnerability management, incident response, secure-development controls, change management, data governance, and auditability.
Extend asset management to AI systems. Track models, datasets, prompts, system instructions, tools, agents, memory stores, pipelines, guardrails, connectors, APIs, and non-human identities.
Keep humans accountable. Require expert validation for AI-assisted findings, exploitation decisions, production changes, risk acceptance, and external disclosure.
Use AI defensively where it is strongest. Good candidates are summarization, triage support, detection enrichment, report drafting, scenario generation, code-review assistance, and prioritization, always backed by evidence and reproducibility.
Treat explainability as sensitive. Explanations aid compliance and debugging, but can also leak model logic or attack intelligence. Classify and restrict how they are shared.
Prepare for agent security. Build controls around identity, permissions, tool use, memory, logging, rate limits, cost, policy enforcement, and post-action review.
Update vulnerability-disclosure workflows. Expect higher report volume and shorter remediation windows. Define intake-quality thresholds, prioritization rules, escalation paths, and customer/regulator communication.
Coordinate beyond the enterprise. Work with vendors, customers, researchers, platforms, regulators, and standards bodies on shared baselines for AI security testing, disclosure, incident reporting, and remediation.
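An intake-quality threshold, as recommended for disclosure workflows above, can be as simple as routing reports by the evidence they carry before anyone weighs the claimed severity. The following is a sketch under stated assumptions: the `Report` fields, the routing labels, and the thresholds are hypothetical and would need tuning to a real program.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical shape of an incoming vulnerability report."""
    report_id: str
    has_repro_steps: bool
    has_proof_of_concept: bool
    affected_asset_known: bool
    claimed_severity: int  # 1 (low) to 4 (critical), as stated by the submitter


def intake_decision(report: Report) -> str:
    """Route by evidence first, claimed severity second."""
    evidence = sum(
        [report.has_repro_steps, report.has_proof_of_concept, report.affected_asset_known]
    )
    if evidence == 0:
        # Plausible-sounding but unsupported: send back before it costs triage time.
        return "return_to_submitter"
    if evidence == 3 and report.claimed_severity >= 3:
        # Fully evidenced and serious: skip the queue, but a human still validates.
        return "escalate"
    return "human_triage"


reports = [
    Report("R1", True, True, True, 4),
    Report("R2", False, False, False, 4),
    Report("R3", True, False, False, 2),
]
decisions = {r.report_id: intake_decision(r) for r in reports}
print(decisions)
```

Note that no path ends in automatic acceptance: the threshold only decides how quickly a report reaches an accountable person, which keeps the rule consistent with expert validation of AI-assisted findings.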

What policymakers should prioritize

Focus less on abstract AI-safety language and more on operational requirements that can actually be tested: AI asset inventories, audit trails, evidence standards, incident reporting, human oversight, model and agent access controls, vulnerability-disclosure rules, and security baselines for high-risk AI systems.

Standards also need to answer AI-specific scoring and accountability questions: how to rate model extraction, prompt injection, tool misuse, data leakage, agent compromise, and multi-vendor failure.

Oversight also needs inspection power of its own. In early 2026 Germany set up its national AI supervision under the EU AI Act: the KI-MIG act makes the Bundesnetzagentur the central market-surveillance authority, with a coordination center (KoKIVO) on top. That launch reopened a basic question: is coordination plus enforcement enough? It is not. A regulator that can convene stakeholders and levy fines but cannot itself inspect (test the models, probe the agent harness, examine the underlying infrastructure) is governing one step removed from the system it is meant to secure. Because technology advances faster than regulation, that distance only grows. The same question ran through the forum: can European bodies actually access and assess frontier models end-to-end, or only regulate them on paper? Real inspection capability (evaluation labs, methods, technical staff, and access) is the precondition that makes the rest of the framework credible.

Six questions every executive should be asking

Which AI systems and agents are already in use across the organization?
Who owns their security, compliance, and incident response?
What evidence is required before an AI-assisted finding becomes actionable?
How are AI prompts, tools, datasets, and outputs logged and governed?
Can current vulnerability management handle AI-driven report volume?
What happens if an AI agent takes an unauthorized or harmful action?
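The first questions on this list (what is in use, who owns it, how it is logged) become answerable once AI systems sit in an inventory that can be queried. Below is a minimal sketch, assuming a hypothetical `AIAsset` record; the field names and example entries are invented for illustration and are not a standard schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AIAsset:
    """Hypothetical inventory record for one AI system, agent, or connector."""
    name: str
    kind: str  # e.g. "model", "agent", "connector", "dataset"
    security_owner: Optional[str]
    incident_contact: Optional[str]
    logging_enabled: bool


def governance_gaps(assets):
    """Return, per asset name, the governance items that are still missing."""
    gaps = {}
    for asset in assets:
        missing = []
        if not asset.security_owner:
            missing.append("security owner")
        if not asset.incident_contact:
            missing.append("incident contact")
        if not asset.logging_enabled:
            missing.append("logging")
        if missing:
            gaps[asset.name] = missing
    return gaps


inventory = [
    AIAsset("support-chat-agent", "agent", "appsec-team", "soc-oncall", True),
    AIAsset("crm-connector", "connector", None, "soc-oncall", False),
]
gaps = governance_gaps(inventory)
print(gaps)
```

A report like this does not secure anything by itself, but it turns the executive questions into a list of named gaps with someone to assign them to.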

The bottom line

Adopt AI in cybersecurity deliberately, not defensively, not theatrically. Use it to accelerate skilled teams, improve visibility, and cut manual burden. But keep final judgment, risk ownership, and operational authority with accountable humans. The winners will be the organizations that pair AI speed with mature security operations, disciplined governance, and credible evidence.

Want the argument in a single move? Open the latest frontier model in your terminal and ask it to audit your device for security vulnerabilities, reasoning like a CISO. The speed and quality of what comes back make the accelerator point better than any slide, and the moment it proposes fixes is exactly where the discipline bites: review before you apply.

This piece draws on the AI Security Forum 2026, held on 11 June 2026 at Sapanca Lake, Turkey, and hosted by the Global Digital Foundation. The forum convened an international group of cybersecurity and AI-security experts from across the value chain: from the network layer to applications, from defensive to offensive cybersecurity.

Thorsten Jelinek’s Substack on Digital Governance
