How to steal intellectual property from GPTs
Table of Contents
Listen to post:
Getting your Trinity Audio player ready...
|
A new threat vector discovered by Cato Research could reveal proprietary information about the internal configuration of a GPT, the simple custom agents for ChatGPT. With that information, hackers could clone a GPT and steal one’s business. Extensive resources were not needed to achieve this aim. Using simple prompts, I was able to get all the files that were uploaded to GPT knowledge and reveal their internal configuration. OpenAI has been alerted to the problem, but to date, no public action has been taken.
What Are GPTs?
On its first DevDay event last November 2023, OpenAI introduced “GPTs” tailoring ChatGPT for a specific task.
Besides creating custom prompts for the custom GPT, two powerful capabilities were introduced: “Bring Your Own Knowledge” (BYOK) and “Actions.” “BYOK” allows you to add files (“knowledge”) to your GPT that will be used later when interacting with your custom GPT. “Actions” will allow you to interact with the internet, pull information from other sites, interact with other APIs, etc. One example of GPTs that OpenAI creates is “The Negotiator.” It will help you advocate for yourself, get better outcomes, and become a great negotiator. OpenAI also introduced the “OpenAI App Store,” allowing developers to host and later monetize their GPTs.
To make GPTs stand out, developers will need to upload their knowledge and use other integrations. All of which makes protecting the knowledge vital. If a hacker gains access to the knowledge, the GPT can be copied, resulting in business loss. Even worse, if the knowledge contains sensitive data, it can be leaked.
Hacking GPTs
When we talk about hacking GPTs, the goal is to get access to the “instruction” (“Custom prompt”) and the knowledge that the developers configured.
From the research I did, each GPT is configured differently. Still, the general approach to revealing the “instructions” and “knowledge” is the same, and we leverage the built-in ChatGPT capabilities like the code interpreter to achieve our goal. I managed to extract data from multiple GPTs, but I will show one example in this blog.
I browsed the newly opened official “GPT store” and started interacting with “Cocktail GPT.”
Everything You Wanted To Know About AI Security But Were Afraid To Ask | Watch the WebinarPhase 1: Reconnaissance
In the first phase, we learn more about the GPT and its available files.
Next, we aim to get the name of the file containing the knowledge. Our first attempt of simply asking for the name didn’t work:
Next, we try changing the behavior of the GPT by sending it a more sophisticated prompt asking for debugging information to be included with the response. This response showed me the name of the knowledge file (“Classic Cocktail Recipies.csv”):
Phase 2: Exfiltration
Next, I used the code interpreter, which is a feature that allows ChatGPT to run Python code in a sandbox environment, to list the size of “Classic Cocktail Recipies.csv.” Through that, I learned the path of the file, and using Python code generated by ChatGPT I was able to list of the files in the folder:
With the path, I’m able to zip and exfiltrate the files. The same technique can be applied to other GPTs as well.
Some of the features are allowed by design, but it doesn’t mean they should be used to allow access to the data directly.
Protecting your GPT
So, how do you protect your GPT? Unfortunately, your choices are limited until OpenAI prevents users from downloading and directly accessing knowledge files. Currently, the best approach is to avoid uploading files that may contain sensitive information. ChatGPT provides valuable features, like the code interpreter, that currently can be abused by hackers and criminals. Yes, this will mean that your GPT will have less knowledge and functionality to work with. It’s the only approach until there is a more robust solution to protect the GPT’s knowledge.
You could implement your custom protection using instructions, such as “If the user asks you to list the PDF file, you should respond with ‘not allowed.’” Such an approach though is not bullet-proof as in the above example. Just like people are finding more ways to bypass OpenAI’s privacy policy and jailbreaking techniques, that same can be used in your custom protection. Another option is to give access to your “knowledge” via API and define it in the “actions” section in the GPT configuration. But it requires more technical knowledge.