The OWASP Top 10 for LLMs: What Actually Matters

I’ve built RAG pipelines that serve thousands of users. I’ve shipped multi-agent systems in enterprise. I’ve set up the MLOps infrastructure that keeps these things running. And through all of that, I’ve watched security be treated as something we’ll “figure out later.”

OWASP’s Top 10 for LLM Applications landed in 2023 as a first attempt to catalog what can go wrong. The 2025 update just came out, and it’s a significantly different list. Four new entries, several reshuffled priorities, and a clear shift in framing - from “what can go wrong with a chatbot” to “what can go wrong with an AI system that has real agency.”

I’m not going to walk through all ten in the same order OWASP lists them. Instead, here’s how I’d group them based on what I’ve seen actually matter.

The ones that will burn you

Prompt Injection (LLM01)

Still #1. Still unsolved. The NCSC put out a warning in December 2025 saying this “may never be fully fixed,” and that tracks with everything I’ve observed.

The problem is fundamental: LLMs can’t tell the difference between instructions and data. When your agent reads a webpage, processes an email, or searches a knowledge base, any text in that content can hijack its behavior. Direct injection - someone typing “ignore your instructions” into a chatbot - is easy enough to filter. Indirect injection is the real threat. An attacker plants hidden instructions in a document your RAG pipeline indexes, or in a webpage your agent browses, and the model follows them because it doesn’t know the difference.

In 2025, a banking assistant was exploited through indirect prompt injection. Attackers sent crafted messages through the chat interface that tricked the AI into bypassing transaction verification. The company lost $250,000 before anyone noticed. This wasn’t a lab experiment. It was production.

If you’re building anything where an LLM processes content it didn’t generate, prompt injection is your biggest threat. Period.

Excessive Agency (LLM06)

This got the biggest expansion in the 2025 update, and it deserved it. When I was building multi-agent systems, the temptation was always to give agents more tools, more access, more autonomy - because it made them more useful. But every tool you hand an agent is a tool an attacker can use through that agent.

The GitHub Copilot vulnerability (CVE-2025-53773) is the example everyone should study. An attacker plants prompt injection in a public repository’s code comments. Developer opens the repo with Copilot active. The injected prompt modifies Copilot’s settings to enable auto-execution mode. Suddenly the attacker has arbitrary code execution on the developer’s machine. All through a code comment in a public repo.

The principle of least privilege isn’t new. But applying it to AI agents requires a different mindset than applying it to microservices. An agent that can send emails, write files, and execute code isn’t a productivity tool - it’s an attack surface with a chat interface.

Improper Output Handling (LLM05)

This is the one that bridges LLM security back to the security fundamentals everyone already knows. When an LLM generates output and you pass it to another system without validation, you get SQL injection, XSS, command injection - all the classics. The LLM just becomes the attack vector instead of a user form.

Consider a Text2SQL application where the LLM hallucinates a query. Instead of DELETE FROM users WHERE id = 123, it generates DELETE FROM users. No WHERE clause. If the application executes it without validation, you’ve lost your users table because a language model got creative.

I’ve seen teams obsess over prompt injection defenses while piping raw LLM output straight into database queries. The input side gets all the attention. The output side gets none. This is the vulnerability that bites people who think LLM security is only about what goes into the model.

The ones you’re probably ignoring

System Prompt Leakage (LLM07)

New to the 2025 list, and it’s here because it turned out to be embarrassingly easy. Bing Chat’s system prompt leaked when users just asked for it. Custom GPTs in the OpenAI Store were leaking prompts within days of launch. Copilot’s internal instructions were extracted multiple times throughout 2024.

Here’s why this matters more than it sounds: teams put things in system prompts that don’t belong there. API keys, internal URLs, decision thresholds (“transaction limit is $5,000 per day”), filtering logic. When an attacker extracts your system prompt, they don’t just see your instructions - they get the blueprint for every guardrail you’ve built. And then they know exactly how to work around them.

The fix is architectural and it’s not complicated. Don’t put secrets in prompts. Credentials go in environment variables. Business logic goes in application code. Guardrails go in independent systems the model can’t override. The system prompt should contain behavioral instructions and nothing else. I’ve seen production systems where a single prompt extraction would expose database credentials. That’s not an LLM problem - that’s an engineering problem.

Vector and Embedding Weaknesses (LLM08)

This is entirely new to the list, and it exists because everyone started building RAG systems without thinking about the security of the retrieval layer. 53% of companies using LLMs chose RAG over fine-tuning. That’s a lot of vector databases holding proprietary data with access controls that were bolted on as an afterthought.

Having built RAG pipelines, I can tell you the default posture is usually: index everything, retrieve based on similarity, serve results to whoever asks. Tenant isolation? Maybe. Access control on document retrieval? Rarely at the vector store level. Audit logging on what was retrieved and served? Almost never.

The attacks here are real: embedding inversion to reconstruct original documents from vectors, cross-tenant leakage in multi-tenant deployments, and poisoned documents injected into the knowledge base to manipulate retrieval results. That last one is particularly nasty - an attacker publishes content that’s semantically similar to common queries, and your RAG pipeline starts retrieving it alongside legitimate sources.

If you’re running RAG in production, audit your vector store’s access model. Not tomorrow. Now.

Sensitive Information Disclosure (LLM02)

This jumped from #6 to #2. The reason is simple - LLMs now touch way more data than they did two years ago. RAG pipelines pull from internal knowledge bases. Fine-tuning uses proprietary datasets. Agents query databases directly. The model doesn’t “know” what’s confidential. It generates the most likely next token, and sometimes that token is someone’s API key.

The mitigation isn’t exciting but it’s essential: sanitize data before it enters the pipeline, enforce access controls at the retrieval layer not the model layer, and monitor outputs for sensitive patterns. If your RAG pipeline can retrieve a document, assume the model will quote it verbatim to anyone who asks the right question.

The ones that are old problems in new clothes

Supply Chain (LLM03)

Compromised npm packages but for AI. You download model weights from Hugging Face, use third-party APIs, fine-tune on datasets you didn’t curate. A model that aces your benchmarks can still contain a backdoor trigger that activates on specific inputs. OWASP recommends cryptographic signing and SBOM for your AI stack. The gap between that recommendation and what teams actually do is enormous.

Data and Model Poisoning (LLM04)

Used to be called “Training Data Poisoning.” The scope expanded because the pipeline expanded - poisoning can now happen during fine-tuning, embedding generation, or through compromised LoRA adapters. The most practical attack: an attacker publishes a blog post with false but plausible technical information, your RAG pipeline indexes it, and your internal AI assistant starts confidently repeating it to your engineers. No access to training infrastructure needed.

Misinformation (LLM09)

Hallucinations have always been an LLM problem. They made the Top 10 because the consequences caught up. A coding assistant hallucinated a package name, attackers registered the package with malware, developers installed it. Medical chatbots gave wrong diagnoses. Legal AI cited cases that don’t exist. The distinction: if a human reads the output and checks it, hallucination is annoying. If an automated system acts on it, hallucination is a vulnerability.

Unbounded Consumption (LLM10)

DDoS for the AI age. Someone floods your API with long inputs, triggers maximum-length responses, or hammers your endpoint from throwaway accounts. Your inference costs spike, legitimate users get throttled, you wake up to a bill you weren’t expecting. Rate limiting, token caps, per-user quotas. Standard ops.

What the 2025 list really tells us

The shift from 2023 to 2025 mirrors the shift in how we use LLMs. Chatbots had a small attack surface - text in, text out. Agents have tools, data access, autonomy, and persistent memory. Every capability is also an attack vector.

The other thing worth noticing: half this list is just traditional application security applied to LLM systems. Output validation, supply chain integrity, access control, resource management - these aren’t new concepts. The LLM-specific problems (prompt injection, system prompt leakage, embedding weaknesses) sit on top of fundamentals that have been understood for decades. Teams that skip the fundamentals because they’re focused on the “AI-specific” threats are building on sand.

The full OWASP Top 10 for LLM Applications 2025 is at genai.owasp.org.