AI Agents Still Can't Stop Prompt Injection Attacks, Researchers Warn

Summary

New research shows that AI web agents remain highly vulnerable to prompt injection attacks. A study by researchers from Nanyang Technological University, ST Engineering, IBM Research, and the University of Illinois Urbana-Champaign found that none of the tested agents consistently resisted the attacks. The team built StakeBench, a benchmark that evaluates prompt injection in realistic online settings and measures how harm varies with the victim, task alignment, environmental cues, and the point during execution at which the agent is exposed. Across 3,168 simulations using NanoBrowser and BrowserUse with GPT-5 and Gemini 2.5-Flash, direct prompt injections succeeded more than 79% of the time in every configuration, while indirect attacks succeeded between 41.67% and 68.16% of the time. The study also identified “stealthy parasitism,” in which an agent completes the user’s task while secretly advancing an attacker’s goal, such as nudging product recommendations. The findings indicate that prompt-injection risk depends on context and on which stakeholder is harmed, not just on model quality.
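
To make the two reported measures concrete, here is a minimal Python sketch of how a per-attack-type success rate and a count of "stealthy parasitism" outcomes could be tallied from simulation logs. It is an illustration only, not StakeBench's actual code or data format: the `Run` record, its field names, and the toy runs are all hypothetical, and the sketch assumes an attack counts as successful whenever the attacker's goal is met.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One simulated agent run (hypothetical schema, not StakeBench's)."""
    attack_type: str         # "direct" or "indirect"
    user_task_done: bool     # did the agent finish what the user asked?
    attacker_goal_met: bool  # did the injected instruction take effect?


def success_rates(runs):
    """Fraction of runs per attack type in which the attacker's goal was met."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for run in runs:
        totals[run.attack_type] += 1
        hits[run.attack_type] += run.attacker_goal_met
    return {kind: hits[kind] / totals[kind] for kind in totals}


def is_stealthy_parasitism(run):
    """The user's task completes AND the attacker's goal advances."""
    return run.user_task_done and run.attacker_goal_met


# Toy records invented for illustration; these are not the study's results.
toy_runs = [
    Run("direct", False, True),
    Run("direct", True, True),
    Run("direct", False, False),
    Run("direct", True, True),
    Run("indirect", True, True),
    Run("indirect", True, False),
]

rates = success_rates(toy_runs)
stealthy = sum(is_stealthy_parasitism(run) for run in toy_runs)
print(rates, stealthy)
```

Tracking the two flags separately is what lets a tally like this tell apart an attack that derails the user's task from one that quietly rides along with it.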