DoomArena: A Security Evaluation Framework for AI Agents

October 27, 2025, 10:00 am to 11:00 am

Virtual event

We present DoomArena, a security evaluation framework for AI agents. DoomArena is designed on three principles: 1) It is a plug-in framework that integrates easily into realistic agentic frameworks such as BrowserGym (for web agents) and τ-bench (for tool-calling agents); 2) It is configurable and supports detailed threat modeling, letting users specify which components of the agentic framework are attackable and what the attacker's targets are; and 3) It is modular, decoupling the development of attacks from the details of the environment in which the agent is deployed, so that the same attacks can be applied across multiple environments.

We illustrate several advantages of our framework: it adapts easily to new threat models and environments, it makes it straightforward to combine several previously published attacks for comprehensive and fine-grained security testing, and it supports analysis of the trade-offs between various vulnerabilities and performance.

We apply DoomArena to state-of-the-art (SOTA) web and tool-calling agents and find a number of surprising results: 1) SOTA agents have varying levels of vulnerability to different threat models (malicious user vs. malicious environment), and no agent is Pareto-dominant across all threat models; 2) When multiple attacks are applied to an agent, they often combine constructively; 3) Guardrail-model-based defenses seem to fail, while defenses based on powerful SOTA LLMs work better.

This presentation is facilitated by Léo Boisvert. Léo is a PhD candidate in Computer Engineering at Polytechnique Montréal and MILA, specializing in AI security and web agents. His research focuses on developing secure LLM-based systems through benchmarks and frameworks, including WorkArena++ and DoomArena. Léo bridges technical innovation with public understanding through his column on Radio-Canada. His work has been published in top-tier venues including NeurIPS and TMLR, contributing to both the technical advancement and the ethical implementation of AI systems.

This event is organized by the Digital Research Alliance of Canada.
