The Emergence of Package Hallucination Attacks
When AI generates code referencing packages that don't exist, attackers can register those names with malicious software. Our Cyber Security Specialist, Marcello Carboni, explains how this emerging supply chain attack works and what teams can do to protect against it.
June 06, 2025
A novel cyberattack technique is emerging: one that exploits Large Language Model (LLM) hallucinations as an attack vector to install malware on developers' systems.
But let's take a step back. The basics of this attack resemble a technique known as typosquatting, where an attacker publishes a malicious package whose name closely resembles a legitimate one, say, a Python package named 'panda' instead of 'pandas'. A simple typo can then lead a developer to install and execute malware on their machine.
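The lookalike-name problem described above can be checked for mechanically. The following is a minimal sketch, not a complete defence: it uses Python's standard `difflib` module to flag a requested name that is suspiciously close to, but not identical to, a package the team already uses. The `KNOWN_PACKAGES` list, the `find_lookalike` helper, and the 0.8 similarity cutoff are all assumptions made for illustration.

```python
import difflib

# Hypothetical list of packages the team actually intends to use.
KNOWN_PACKAGES = ("pandas", "numpy", "requests", "flask")


def find_lookalike(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return the known package that `name` closely resembles, or None.

    An exact match is not a lookalike, so it also returns None.
    """
    normalized = name.lower()
    if normalized in known:
        return None
    # get_close_matches ranks candidates by difflib's similarity ratio
    # and drops anything scoring below the cutoff.
    matches = difflib.get_close_matches(normalized, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With these assumptions, `find_lookalike("panda")` returns `"pandas"`, which is a signal to stop and double-check the name before installing anything. The cutoff would need tuning against a real package list to keep false positives manageable.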
Now, the same principle can be applied to AI-generated code.
LLMs generate code by predicting the most likely next token based on patterns in their training data. As a result, a model can "hallucinate" all or part of a response: output that looks plausible but is simply wrong, such as an import of a package that was never published.
According to research conducted by Vulcan, hallucinated package names are surprisingly common: around 20% of Node.js code snippets generated using ChatGPT contained an unregistered package, while for Python that number jumps to 35%.
An attacker can turn these hallucinations to their advantage, even though they are often inconsistent between responses. One approach is to run thousands of ChatGPT queries and register every nonexistent package that turns up. Another is to concentrate on finding a cluster of low-entropy tokens that might be hallucinated with a higher probability.
This opens the door to a novel form of supply chain attack, exploiting the trust developers place in the AI's responses.
The attack chain is quite simple, and looks like this:
- An attacker finds a reference to a nonexistent package in LLM-generated code.
- The attacker registers a package under that name, filled with malicious software.
- Developers who receive the same suggestion install the package and run the generated code snippets, importing and executing the malicious code.
Mitigation Strategies
- Education: Teams should be made aware of the risk of LLM hallucinations and should always check generated responses for false information, including whether each suggested package really exists and is the one intended.
- Pipeline Scanning: CI/CD pipelines should include automated vulnerability and configuration scanning.
- Strict Dependency Management: Have a system in place to assess and approve packages before their use.
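The dependency management point can be enforced automatically. As a rough sketch, assuming dependencies are declared in a pip-style requirements file and approvals are kept in a simple allowlist, a pipeline step could reject any requirement that has not been approved. The `APPROVED` set and both function names are hypothetical, and the line parsing is deliberately simplified; it does not handle every form a requirements file allows.

```python
import re

# Hypothetical allowlist maintained by whoever approves dependencies.
APPROVED = frozenset({"requests", "numpy", "pandas"})


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unapproved_requirements(requirements_text, approved=APPROVED):
    """Return the requirement names that are not on the allowlist."""
    allowed = {normalize(n) for n in approved}
    rejected = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as -r or -e
        # Simplified: take the leading name, ignoring version specifiers.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match and normalize(match.group(0)) not in allowed:
            rejected.append(match.group(0))
    return rejected


# "made-up-helper" is an invented name for illustration.
example = "requests==2.32.0\n# pinned by the platform team\nNumPy>=1.26\nmade-up-helper\n"
rejected = unapproved_requirements(example)
```

In this example `rejected` holds only `made-up-helper`. A CI job could fail the build whenever the returned list is non-empty, so that a new dependency only gets in after someone has reviewed it and added it to the allowlist.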