Advanced Search

0-Click in ChatGPT.

Fixxx

Moderator
Judge
Elite
Ultimate
Legend
Joined
31.10.19
Messages
1,178
Reaction score
3,199
Points
113
1762653289259.png
Hackers can poison the memory of ChatGPT and steal data between sessions.

There was discovered seven new vulnerabilities and exploitation techniques in ChatGPT that allow extracting private user data, bypassing protections and maintaining access between sessions. These issues are related to indirect instruction injections, bypassing link verification mechanisms and methods of forcing the model to leak information over the long term. Most demonstrations were implemented on the current GPT-5 model (or earlier versions) and attack scenarios cover simple user actions, such as regular search query. The exploited mechanisms are based on the weakness of input content processing by language models, known as prompt injection. An attacker places instructions in the data that the model processes when working with web pages or indexed content, after which the LLM can deviate from the original task and execute a foreign command. Here are seven techniques and vulnerabilities: indirect injection in the browsing context, "zero-click" through indexed content in search results, vulnerability in forming a query through the q parameter in the URL, bypassing the url_safe mechanism, conversation injection, hiding malicious content in rendering and the mechanism of injecting into long-term memory. All these techniques demonstrate both individual dangers and combinations that provide a complete scenario of compromise.
  1. The first discovered problem allows injecting instructions through comments on trusted sites: when requesting a summary of the material, the model initiates viewing the page and processes the content of other people's comments, so a specially formatted entry can turn a safe review into a command to disclose information.
  2. The second technique, "zero-click", shows that it's enough for a malicious resource to be in the search engine indices and during a regular user query, the LLM can refer to it and receive an injection without additional action from the victim. Researchers created sites with targeted topic names and learned to show malicious instructions only for the search subsystem, which led to successful PoC in real conditions.
  3. The third path is a simple substitution of a query through a special parameter in the address bar, which OpenAI allowed to accept as a ready-made prompt; clicking on such a link turns the user into a victim of injection, as the q-parameter is automatically substituted into the model's query.
  4. The fourth vector uses bypassing the url_safe link check. Since the bing.com domain is on the whitelist, search results wrapped in Bing tracking links passed the check and were rendered completely. Researchers showed how, using a set of indexed pages, you can extract any string letter by letter through sequential output of "safe" links and thus exfiltrate data.
  5. The fifth technique is called Conversation Injection - a chain in which the response of the auxiliary search system (SearchGPT) includes a prompt for the main model and ChatGPT, viewing the conversation history, perceives it as part of the context and follows malicious instructions. Such a scenario turns the limitations of the lightweight browser into an implicit path to controlling the main agent.
  6. The sixth trick uses a markdown rendering bug: part of the text on the same line as the opening marker of the code block is not displayed to the user in the interface but remains available for internal processing by the model. Researchers demonstrate how a malicious fragment can be hidden in a seemingly innocent visible response and unnoticeably prompt the model to perform unwanted actions.
  7. The seventh and most dangerous technique is injecting into long-term memory. Tenable showed that through a carefully crafted response, SearchGPT can prompt the main system to update the biographical memory, and then malicious instructions become a permanent part of the context, affecting responses in future sessions and creating a persistent data leak channel.
Combining these techniques resulted in several full-fledged PoC: phishing campaigns where a malicious link appeared in the summary response and prompted the user to navigate to an external resource; hidden comments on popular blogs through which constant compromise occurred; indexed sites providing "zero-click" for mass attacks and scenarios with long-term injection where the victim's information becomes a regular source of leaks with each new query.
 
Top Bottom