WebPromptTrap β New Indirect Prompt Injection Vulnerability in BrowserOS
|
Listen to post:
Getting your Trinity Audio player ready...
|
Executive Summary
Cato researchers have discovered a new indirect prompt injection exploit pattern workflow in BrowserOS (an open-source agentic AI browser). We named it βWebPromptTrapβ because the prompt originates from untrusted web content and it traps users into approving an authorization step through a trusted-looking AI summary.
The flaw was identified in BrowserOS Agent (Chat Mode), a productivity assistant and in this blog we demonstrate it end-to-end with a proof-of-concept (PoC), where a threat actor manipulates an AI summary to steer a GitHub authorization flow, gains access to the developerβs repositories, and impact parts of the organizationβs development lifecycle.
The PoC focuses on a developer, but the same pattern can target enterprise roles across any industry. A CFO summarizing a finance article may be guided to connect a βtrusted toolβ to the organizationβs enterprise resource planning (ERP) or expense system. An HR manager may be led to authorize access to the HR and payroll platform. A sales ops leader may be prompted to connect the CRM. A support or IT admin may be pushed to approve access to ticketing and service management. In every case, hidden instructions in a webpage can shape the AI summary into a persuasive call to action and a legitimate-looking link, leading users into routine authorization steps that can hand attackers tokens and real access to sensitive SaaS data and actions.
Timeline & Disclosure
- Vulnerability found in: BrowserOS version 0.29.0
- Affected versions: BrowserOS versions 0.30.0 and earlier
- November 13, 2025: Disclosed to the BrowserOS team via Discord DM by the Cato Application Security team.
- December 6, 2025: Submitted a CVE request to MITRE (pending).
- March 24, 2026: At the time of publishing, MITRE has not responded to Cato Networks.
- Fixed version: BrowserOS version 0.32.0
We would like to thank the BrowserOS team for their prompt and professional response throughout the responsible disclosure process, and for addressing the issue with a fix in BrowserOS version 0.32.0.
Technical Overview
Vulnerability
- Name: Indirect Prompt Injection (WebPromptTrap)
- Component: BrowserOS Agent, Chat Mode (page summarization)
- Attack surface: Webpage content ingested for summarization, including content that is not visible to users
How WebPromptTrap Works?
Indirect prompt injection is when an LLM-based assistant consumes untrusted webpage content and mistakenly treats embedded instructions as guidance, resulting in threat actor-controlled output. In WebPromptTrap, the trick is simple: the threat actor hides instructions in the page using βinvisibleβ HTML/CSS (off-screen text, low opacity), which the agent can still ingest during summarization.
End-to-End Flow
- Threat actor publishes a legitimate-looking article with hidden instructions.
- Victim clicks βSummarizeβ in BrowserOS Chat Mode.
- The summary is manipulated to include a persuasive recommendation and an external link.
- Victim clicks, reaches a legitimate-looking authorization page, and approves access.
- Threat actor receives a token (or equivalent) and gains access to account data and actions.
PoC
In the following PoC, we created a legitimate-looking blog post that contains hidden instructions designed to manipulate the AIβs response. These instructions are not visible to the user during normal reading, but they are still present in the page content that the agent can ingest during summarization (see Figure 1).
Figure 1 shows the injected hidden instruction block embedded inside the blog post HTML, alongside otherwise legitimate content.
Figure 1. Hidden prompt injection in webpage content
Instead of returning a neutral summary, the AI output is steered to include a persuasive call-to-action (CTA) and a link that appears endorsed by the AI assistant (see Figure 2).
Figure 2 shows the resulting AI-generated summary, including the threat actor-directed βbottom lineβ message and the link to a threat actor-controlled destination.
Figure 2. Manipulated AI summary output
Observed effect: Instead of a neutral summary, the AI assistant can be nudged to output a persuasive βbottom lineβ CTA with a clickable link. That creates the trust bridge that makes the next step (authorization) feel safe.
Important: The OAuth token theft path is only one example. The same pattern can pivot into credential phishing, malware lures, misinformation, or cross-tab social engineering, depending on what the AI is allowed to output or do.
Video demo
In the video below, we demonstrate how hidden webpage instructions can influence BrowserOS Chat Mode summarization and produce a threat actor-controlled link that leads into an authorization flow.
Security Best Practices
Reducing Risk in AI Browsers
Indirect prompt injection is a broader class of risk that arises from allowing models to ingest untrusted web content. Today, there is no single, definitive mitigation that eliminates it completely in all cases. AI browsers should therefore treat this as an ongoing security problem and apply layered, compensating controls that reduce the likelihood of manipulation and limit impact when it occurs.
- Treat webpage content as untrusted
- Separate trusted user commands from untrusted page content using strict delimiters and data marking.
- Explicitly instruct the LLM that webpage text is untrusted data and must not be treated as instructions.
- Filter and minimize hidden or suspicious content
- Detect patterns such as extreme opacity, off-screen text blocks, tiny font tricks, and instruction-like payloads.
- Prefer visible text for summarization where possible, while handling legitimate accessibility content carefully.
- Constrain summary outputs
- For βSummarizeβ actions, consider a summary-only mode:
- No clickable links by default, or;
- Only same-origin links, or;
- Links shown as plain text plus a safety interstitial.
- For βSummarizeβ actions, consider a summary-only mode:
- Add friction before risky actions
- If the AI assistant suggests an external link, show a confirmation dialog that clearly labels it as AI-generated and untrusted.
- Display the destination domain clearly and warn about authorization-consent scams.
- Monitoring and detection
- Log summarization actions and outputs (privacy-aware) and detect anomalies such as CTA plus external link patterns.
- Alert when outputs resemble instruction patterns or include claims that do not match the page content.
No single control is sufficient. The goal is to make manipulation harder, make risky outputs safer by default, and reduce blast radius when a user is targeted.
Conclusion
WebPromptTrap is a simple idea with outsized consequences. If an AI browser summary can be steered, the userβs trust becomes the exploit path. The next click can be an authorization event, not just navigation.
BrowserOS handled this in the open-source spirit, disclosure, iteration, and a fixed release. The bigger takeaway remains: agentic AI browsing demands security boundaries that assume webpage content is hostile, and outputs that can trigger authorization must be treated as high risk by default.
Protections
Even with upstream fixes, enterprises should assume indirect prompt injection will recur across AI browsers and assistants. Defense-in-depth matters.
Network-Layer GenAI and MCP Controls
Enterprises can reduce risk by applying generative AI (genAI) security controls at the network layer through the Cato SASE Platform. These controls operate at the network layer and can apply to MCP-related traffic and AI tool activity across the enterprise.
With these capabilities, enterprises can define security rules that inspect and control AI tool activity.
Examples:
- Create policies that block or alert on high-risk remote tool calls (for example, create, add, edit).
- Enforce least privilege for AI-driven actions.
- Detect and alert on suspicious prompt usage in near real time.
- Maintain detailed audit logs of AI and MCP-related activity across the network.
By combining secure application configuration with network-layer genAI security controls, enterprises gain stronger protection against token theft, data exposure, and unauthorized AI tool execution.