From autocomplete to full-blown code generation, AI-powered development tools like Cursor are transforming the way software is built. They’re fast, intuitive, and trusted by some of the world’s most recognized brands, such as Samsung, Shopify, monday.com, US Foods, and more.
In our latest research, HiddenLayer’s security team demonstrates how attackers can exploit seemingly harmless files, like a GitHub README, to secretly take control of Cursor’s AI assistant. No malware. No downloads. Just cleverly hidden instructions that cause the assistant to run dangerous commands, steal credentials, or bypass user safeguards. All without the user ever knowing.
This Isn’t Just a Developer Problem
Consider this: monday.com, a Cursor customer, powers workflows for 60% of the Fortune 500. If an engineer unknowingly introduces malicious code into an internal tool, thanks to a compromised AI assistant, the ripple effect could reach far beyond a single team or product.
This is the new reality of AI in the software supply chain. The tools we trust to write secure code can themselves become vectors of compromise.
How the Attack Works
We’re not talking about traditional hacks. This threat comes from a technique called Indirect Prompt Injection, a way of slipping malicious instructions into documents, emails, or websites that AI systems interact with. When an AI reads those instructions, it follows them regardless of what the user asked it to do. In essence, text has become the payload and the malware exposing not only your model but anything reliant upon it (biz process, end user, transactions, ect) to harm and create financial impact.
In our demonstration, a developer clones a project from GitHub and asks Cursor to help set it up. Hidden in a comment block within the README? A silent prompt telling Cursor to search the developer’s system for API keys and send them to a remote server. Cursor complies. No warning. No permission requested.
In another example, we show how an attacker can chain together two “safe” tools, one to read sensitive files, another to secretly send them off, creating a powerful end-to-end exploit without tripping any alarms.
What’s at Stake
This kind of attack is stealthy, scalable, and increasingly easy to execute. It’s not about breaching firewalls, it’s about breaching trust.
AI agents are becoming integrated into everyday developer workflows. But without visibility or controls, we’re letting these systems make autonomous decisions with system-level access and very little accountability.
What You Can Do About It
The good news? There are solutions.
At HiddenLayer, we integrated our AI Detection and Response (AIDR) solution into Cursor’s assistant to detect and stop these attacks before they reach the model. Malicious prompt injections are blocked, and sensitive data stays secure. The assistant still works as intended, but now with guardrails.
We also responsibly disclosed all the vulnerabilities we uncovered to the Cursor team, who issued patches in their latest release (v1.3). We commend them for their responsiveness and coordination.
AI Is the Future of Development. Let’s Secure It.
As AI continues to reshape the way we build, operate, and scale software, we need to rethink what “secure by design” means in this new landscape. Securing AI protects not just the tool but everyone downstream.
If your developers are using AI to write code, it’s time to start thinking about how you’re securing the AI itself.