7 min read
The GCG Attack: Three Years Later, We Still Haven't Solved It
In 2023, a single paper broke the safety alignment of every major LLM. Three years and dozens of defenses later, the core problem remains unsolved. Here's what happened.
adversarial ML, AI safety, LLM security