A newly discovered vulnerability in Gmail’s Gemini-powered AI features has raised concerns about the potential for AI-assisted phishing attacks.
Highlights
- Critical Vulnerability Found: Security researcher Marco Figueroa exposed a prompt injection flaw in Gmail’s Gemini AI that allows invisible instructions to be embedded in emails—potentially manipulating AI-generated summaries.
- Hidden Prompts in Emails: Attackers can hide prompts using white-on-white text, zero font sizes, or off-screen CSS. While invisible to users, these prompts can be read and executed by Gemini, creating misleading summaries.
- AI Summaries as Attack Vectors: Unlike traditional phishing, this technique hijacks the AI’s authority by injecting malicious commands into what appears to be a neutral, AI-generated summary—raising the risk of user compliance.
- Google’s Response: Google acknowledged the issue and is rolling out layered defenses including:
- Prompt injection classifiers
- Reinforcement learning against harmful prompts
- Markdown sanitization and suspicious URL redaction
- User warnings and confirmation prompts
- Regulatory Implications: The EU AI Act may classify such deceptive AI behaviors as “high-risk,” which could require Google to implement stricter safety, transparency, and audit protocols for Gemini.
- Security Best Practices: Experts advise treating AI summaries as assistive—not definitive—tools. Users should:
- Be wary of urgent prompts from AI
- Manually verify suspicious emails
- Watch for hidden formatting that may hide instructions
Security researcher Marco Figueroa, who leads Mozilla’s GenAI Bug Bounty Programs, demonstrated how prompt injection techniques could be used to manipulate Gemini into generating misleading or harmful summaries—without the user realizing it.
How the Attack Works?
The exploit relies on indirect prompt injection, where malicious instructions are embedded in an email using invisible formatting,
- White text on a white background
- Font size set to zero
- Off-screen CSS positioning
While these instructions remain invisible to human readers, Gemini’s summarization feature can still interpret them. In tests, Gemini reproduced malicious directives embedded in email content, presenting them as part of a legitimate summary.
Because the output comes from Google’s AI system—viewed by many users as neutral or trustworthy—the likelihood of user compliance increases significantly.
AI Summaries as Attack Vectors
What makes this tactic particularly concerning is that it doesn’t rely on traditional phishing indicators such as suspicious links or attachments.
Instead, it exploits how large language models prioritize and respond to content, particularly when presented in formats designed to mimic admin-level instructions.
In one example, Gemini included a hidden command in its summary that urged users to take a specific, potentially harmful action—despite no such instruction appearing in the visible email.
Figueroa noted that wrapping injected content in authoritative-sounding language increased the chance of the model acting on it.
Google’s Response
Google confirmed it had not observed this attack being used in real-world scenarios but acknowledged the significance of the issue. The company stated it is working on mitigations but did not provide a specific timeline or technical details.
In recent updates, Google shared a multi-layered defense strategy to address prompt injection vulnerabilities:
- Prompt injection classifiers to detect and block hidden commands
- Reinforcement training to steer Gemini away from executing suspicious content
- Sanitization of markdown and redaction of suspicious URLs
- User-facing confirmation prompts and threat notifications
These measures are being gradually deployed to reduce the likelihood and impact of prompt injection attacks within Gmail and other Gemini-integrated products.
EU AI Act Implications
According to cybersecurity firm 0DIN, the attack method may soon fall under new regulatory scrutiny. The EU AI Act, currently in draft form, classifies deceptive AI outputs that manipulate user behavior as “high-risk” use cases under Annex III.
If enforced, this could require Google to implement stricter testing, transparency, and audit processes for AI-powered features like Gemini summaries.
Security Guidance
Cybersecurity professionals caution that users should treat AI-generated summaries as assistive tools, not authoritative sources. Platforms using AI to summarize content—especially in email—should:
- Train users to be cautious of AI-generated prompts suggesting urgent actions
- Flag or quarantine messages containing suspicious formatting (e.g., hidden text)
- Encourage manual review of original emails before acting on AI interpretations
As noted by Lifewire, malicious actors could use such vulnerabilities to insert fake alerts (e.g., “Click here immediately” or “Call this number”), leveraging the AI’s voice of authority to bypass user skepticism.