Following the launch of the AI browser Comet by Perplexity, experts began examining its security. Inspections, including those by Brave, revealed that such browsers are vulnerable to malicious requests from fraudsters, endangering user data privacy. OpenAI has also confirmed this.
The company, which recently released the ChatGPT Atlas browser, published a new blog post outlining the identified vulnerabilities and the actions taken to address them. OpenAI emphasizes that the implementation of malicious requests remains a significant security challenge for AI, requiring continuous enhancement of protective measures.
Malicious request implementations, or prompt injection, are a type of attack on AI agents in browsers, where harmful instructions are deliberately embedded into content. These can be hidden on websites, in emails, PDF files, or other materials processed by AI. The goal of such attacks is to force the model to alter its behavior and execute commands from the attacker instead of user requests.
Such attacks are particularly dangerous as they often do not require human involvement. Users may not even realize that the AI agent is secretly transmitting their personal data to fraudsters or performing other actions implanted by malicious actors, such as sending harmful emails.
To counter these attacks, OpenAI has created an "automated malicious actor based on LLM" – essentially an AI bot that simulates hacker actions and attempts to use prompt injection. Initially, this AI tests attacks in a separate simulator to observe how browser agents respond. By analyzing the results, the system iteratively modifies and improves its attacks to learn to detect them better in real-world conditions. The data obtained is later integrated into protective mechanisms.
OpenAI also showcased an example of prompt injection that its AI detected and used to bolster the security of ChatGPT Atlas. In this scenario, the attacker sent an email containing a hidden instruction for the AI agent – effectively a template for a resignation letter to the CEO. Later, when a user requested a message to be written for the CEO regarding their absence, the agent could have used this instruction to send a resignation letter. However, due to the training process, the system recognized that the instruction was a harmful prompt injection and did not execute it without explicit user confirmation.
"The nature of prompt injection makes deterministic security guarantees challenging, but through the scaling of our automated security research, competitive testing, and strengthening rapid response cycles, we can improve the model's resilience and security before anticipating a real attack," the company states in its blog.
Despite the introduction of new tools and security measures, prompt injection remains a serious threat to AI-based browsers. This has led some industry experts to question whether using such agent-based browsers is advisable, considering the risks to personal data.