Redazione RHC : 29 August 2025 10:21
Anthropic has raised the alarm about a new threat related to “smart” browser extensions: Websites can infiltrate hidden commands that an AI agent will execute without thinking. Anthropic has released a research version of the Claude extension for Chrome and simultaneously published internal test results: When run in a browser, models are susceptible to command injections in 23.6% of unprotected test cases. This data has sparked a debate about the safety of integrating autonomous AI agents into web browsers.
The extension opens a sidebar with constant context of what’s happening in tabs and, on demand, provides access to specific actions, from recording meetings to sending responses, from preparing expense reports to controlling website functions. User-side access is governed by permissions, and the new product is being released in preview only to a thousand subscribers to the Max plan, which costs between $100 and $200 a month; everyone else is on a waiting list.
The project builds on the Computer Use feature launched in October 2024. Back then, Claude could take screenshots and literally move the cursor for a person; The integration has now gone deeper: the agent runs directly within Chrome, without simulating external clicks.
The security checks covered 123 cases grouped into 29 attack scenarios. Without further constraints, the models succumbed to the embedded instructions in 23.6% of attempts. In one example, a malicious email urged the assistant to delete incoming messages “for inbox cleanup purposes”, and without constraints, the agent actually deleted the messages without providing any explanation.
To reduce risk, Anthropic has added several layers of protection. The user can grant and revoke access to specific sites, the agent requires confirmation before posting, purchasing, or transferring personal data, and categories such as financial services, adult content, and sites with pirated material are closed by default. In repeated tests, the success rate for offline attacks dropped to 11.2%, and in a separate series of four browser-only techniques, the new logic reduced the result from 35.7% to 0.
Independent developer Simon Willisson rated the remaining 11.2% as unacceptably high risk and believes the very idea of an agent browser extension is inherently vulnerable. According to the specialist, without absolutely reliable barriers, such an approach will inevitably lead to abuse.
The concerns are supported by competitors’ experience. Brave’s security team recently demonstrated that Perplexity’s Comet browser could be tricked into performing unauthorized actions by hiding instructions in Reddit posts. When a user asked the agent to repeat the discussion, it opened Gmail in a separate tab, extracted the address, and initiated access recovery procedures. Perplexity’s attempt to patch the flaw was unsuccessful; Brave reportedly managed to bypass the proposed measures.
Anthropic intends to use limited previews to collect real-world attack patterns and refine the protection before it becomes widely available. However, at the current level of maturity, the risks are effectively transferred to the user, who uses such an open web assistant at their own risk. Willisson notes that expecting people to competently evaluate all threats in such a dynamic model is unrealistic, so the security issue should be addressed by the vendors themselves before the product is made public.