The community of security professionals known as the Open Web Application Security Project (OWASP) has been publishing and updating the OWASP Top 10 Web Application Security Risks for a while, and now they’ve added the Top Ten for Large Language Models (LLMs).
Its version 1.1 and its still in its very early iterations. As more attacks against LLMs and AI applications happen, this list will surely evolve.
TLDR:
- 500+ Security Researchers developed this List
- 4 are New Vulnerabilities that are specific to AI
- 3 Are existing Vulnerabilities with a new AI specific application
- 3 are Existing Vulnerabilities that are very applicable to LLMs.
- NoCode AI Exploits; Some of these exploits can be written in plain English
- A few of these Vulnerabilities have already been Exploited
LLM01: Prompt Injection - New
Injection attacks have been around for a long time. While LLMs do not use databases, but rather use an abstracted DB which eliminates SQL injection attacks, there is still a user interface that we could use to exploit traditional app sec vulnerabilities, like overflows, memory corruption, etc.
OR the entirely new vulnerability.
We could manipulate a LLM via inputting specially engineered prompts that could exploit logic flaws and trick the model into doing things it's not supposed to. This could potentially trick it into visiting a malicious website, executing code, disclosing sensitive info, violating its own terms of service. We have never used ChatGPT to write exploits. There's a ton of possibilities, which is why prompt injection is #1 on the list. This can pretty much be done without any coding, unlike traditional application injection vulnerabilities.
Kevin Liu, a Stanford University student discovered he could use a prompt injection technique to instruct Bing Chat to ignore previous instructions and reveal information typically hidden from users.
Inputs need to be validated and filtered.
LLM02: Insecure Output Handling - New
This one is similar to other traditional application security issues. The Insecure Output Handling vulnerability is caused by the downstream component blindly accepting the LLM’s output. For example if I tell chatGPT to write code, then it automatically executes its output (the code it wrote) on the client or server side, or the output is too big and overflows, there's a lot of possibilities which is why its #2.
This can lead to Cross Site Scripting and Cross Site Request Forgery in web browsers as well as Server Side Request Forgery, privilege escalation, or remote code execution on the application’s backend systems.
Outputs need to be filtered and validated, and other common application security standards are applicable.
LLM03: Training Data Poisoning - New
This one is interesting. We actually wrote an entire blog post breaking it down. But this basically involves injecting small amounts of maliciously crafted data into the data sets that are used to train the model. In experiments it's been proven successful against just about every type of algorithm that's used to train models.
Medium Severity Attack: This can accomplish denial of service conditions, like when researchers conducted a black box attack against Google Auto ML. They poisoned a fifth of a percent (.2%) of the training data, in a way undetectable to the human eye. The attack changed the google auto machine learning’s results from 82% accuracy to 69% error.
Real Life High Severity Attack: The more concerning objective is engineering a backdoor. There were at least 4 large-scale data poisoning attacks on Gmail’s spam filter between 2017 and 2018. Attackers sent millions of specially crafted emails designed to throw off the classifier and change its definition of a spam email. This allowed attackers to send malicious emails without being detected.
The most likely scenario is the models are trained via interactions or live data on the internet.
LLM04: Model Denial of Service - Old
This isn’t anything new; pretty much the same as any DoS/DDoS. Attackers could overload a LLM application, like say ChatGPT with prompts, especially ones that require more computation, to amplify their DoS attack.
OpenAI has already attributed multiple ChatGPT outages due to DDoS Attacks.
Bot Detectors, Captcha test, DDoS detectors, etc are all needed.
LLM05: Supply Chain Vulnerabilities - Old with a New AI Specific Application
Similar to traditional software development where developers reuse code and/or SDKs from repositories, developers often re-use pre-trained models and publicly available data sets; usually from repositories.
Developers could unknowingly pull poisoned datasets and use them to train the models, as we mentioned above in data poisoning.
Or they could acquire poisoned models from repositories and either fine tune them, or combine them with other models to develop a neural network. The latter is called a transfer learning attack, in experiments researchers have successfully implanted backdoor logic in self-driving cars and antiviruses among others.
LLM06: Sensitive Information Disclosure - Old with a New AI Specific Application
LLMs could be trained on data sets that contain sensitive information, proprietary information, or confidential data. Once trained that is basically baked into the LLM, in a sort of abstracted database, so sometimes they remember the specific info. It's also possible they could look up sensitive information, say like ChatGPT 4 responding to a prompt.
Note: LLMs don’t use your traditional application databases.
LLMs are neural networks, which means it's a bunch of logic stacked on top of one another, so it's foreseeable that there will be errors that could lead to data exposure. These LLMS could disclose this information without proper output filtering or by an attacker using a prompt injection to exploit flaws in security logic, like the output filtering.
LLM07: Insecure Plugin Design - Old
This is pretty much a blanket term for traditional application security vulnerabilities in any plugins, interfaces, etc for the LLMs. This could be authentication issues or or lack of input validation that leads to data exfiltration, privilege escalation, remote code execution, etc.
LLM08: Excessive Agency - Old with a New AI Specific Application
AI could take over the world if too much functionality, excessive permissions, or too much autonomy is programmed into the LLM.
Ok, just kidding (maybe), but too much functionality, excessive permissions, or too much autonomy increases the risk of errors, vulnerabilities, or misuse.
For example a chatbot designed for writing literature probably doesn't need to know how to code, or someone could use it to write malware. This vulnerability, like unnecessary read/write permissions, could also be chained with one of the other vulnerabilities in this list.
LLM09: Over Reliance - New, But Obvious.
This is pretty simple; the AI may not always give accurate outputs, so their outputs need to be validated before relying on it for anything critical.
LLM10: Model Theft - Old
This is really nothing new; intellectual property can be stolen, now the intellectual property is just stored in new places.
Insiders steal intellectual property, corporate espionage happens, Chinese state sponsored hackers have been targeting intellectual property for years, ransomware groups can hold information hostage. The model is now another target on the list.
Get Your LLM tested against the OWASP LLM Top 10.
Request your consultation with a pentester today.