ao link
Business Reporter
Business Reporter
Business Reporter
Search Business Report
My Account
Remember Login
My Account
Remember Login

Google Gemini: a virtual AI assistant with an evil twin

Sponsored by WithSecure
Linked InTwitterFacebook

One of the big promises of the latest AI boom is augmentation: if AI doesn’t take our jobs, it’ll make our jobs more productive.

 

AI assistants aim to make everyday office work easier, or at least help us do more in less time. AI assistants such as Google Gemini, OpenAI’s ChatGPT and Microsoft’s Copilot are now integrating more mailbox features, as demonstrated for instance at Google I/O 2024. These tools promise to enhance your inbox with features that provide more context to your searches and queries.

 

There is a catch – giving LLMs access to user data creates security concerns that put you, your employees, their privacy and your business at risk. A malicious email can turn the helpful Google Gemini into its evil twin that helps attackers socially engineer all kinds of confidential information out of your inbox.

 

We’re going to describe how we successfully employed Gemini to extract confidential information from an email inbox – and then explain how savvy users and security teams can go about stopping this attack.

 

At the moment it’s only possible to conduct the attack we describe with some element of social engineering – basically, the human user has to be convinced to click on a link. It’s not entirely automated.

 

We’re going to demonstrate how a malicious email and a dash of social engineering can trick an unsuspecting user into revealing a secret code used in another email – a common tool for authenticating logins and resetting user passwords.

 

Here’s how it works, assuming the user is employing Gemini to handle a lot of their inbox tasks, rather than reading every email themselves:

  1. Malicious email: An attacker sends an email containing a prompt that instructs Gemini to display a social engineering message to the user, falsely promising an update to a new version of Gemini and requesting a code to activate it.

Gemini is then instructed to find the confidential information the attacker is after (for example, a recovery or MFA code) within the user’s mailbox and display it in base64 format, disguising its true nature. To the user, this looks like a random string of text, but it can be decoded and read with ease. For example, the base64 phrase d3d3LndpdGhzZWN1cmUuY29tCg== translates to www.withsecure.com. For all but the most observant user,the malicious nature of the attack is not immediately clear.

  1. User Interaction with Gemini: The user asks Gemini to summarise their recent emails. 
  2. Prompt injection: When Gemini processes the attacker’s email, the prompt is triggered, causing Gemini to include the phishing message and instructions at the end of its summary. This malicious content appears distinct from the summary, resembling a genuine message from Gemini.
  3. Data compromise: If the user follows the instructions and submits the “activation code,” the attacker gains access to the confidential information.  

To watch the Gemini prompt injection proof of concept happening live, and read a more technical write-up, go to: labs.withsecure.com/publications/gemini-prompt-injection

 

How Google mitigates Gemini attacks

 

Google has been putting a lot of effort into making Gemini as secure as possible against prompt injection attacks. The attack we described had to rely on social engineering techniques and user interaction to succeed. The user must trust and act upon the information provided by Gemini.

 

Other techniques that could be used for automatic data exfiltration without social engineering entirely, or very minimal user interaction (such as just clicking a link), were all stopped successfully. These include:

  • Image-based exfiltration: A common technique for data exfiltration via prompt injection is to coerce the LLM into generating an image with encoded information in its URL, allowing for data exfiltration without user interaction (as the browser will automatically visit the URL as it attempts to display the image). However, we observed that Google had implemented robust safeguards to prevent this. In our tests, any attempt to generate such an image resulted in the chat session being terminated with an error. 
  • URL-based exfiltration: Similarly, attempts to have Gemini generate phishing links containing sensitive information directly in the URL (such as query parameters, subdomains) were unsuccessful. Google’s extensive safeguards appear to effectively vet links produced by Gemini (probably in the same way as image source URLs), preventing data exfiltration through this method. 

What we did: responsible disclosure

 

We disclosed this issue to Google on 19 May 2024. Google acknowledged it as an Abuse Risk on 30 May, and the next day it communicated that its internal team was aware of this issue but would not fix it for the time being. Google’s existing mitigations prevent the most critical data exfiltration attempts already, but stopping social engineering attacks like the one we demonstrated is challenging.

 

How do I stop this happening to me or my employees?

 

In five words: treat LLMs as untrusted entities.

It goes without saying, but it’s clear that many users and some organizations ignore this advice: Exercise caution when using LLM assistants like Gemini or ChatGPT. These tools are undoubtedly useful, but they become dangerous when handling untrusted content from third parties, such as emails, web pages, and documents.

Despite extensive testing and safeguards, the safety of responses cannot be guaranteed when untrusted content enters the LLM’s prompt. 

It’s likely that your organisation already has a policy in place to govern the use of LLMs, especially when it comes to sensitive information like passwords, MFA codes, intellectual property, customer data and commercially-sensitive information – stuff that’s often present in high volumes in users’ email inboxes. That allows for two options:

 

  1. Consider whether your current policy – and any security controls that go with it – is sufficient to protect your organisation from planned or unauthorised use of Gemini and other similar LLM virtual assistants.
  2. Take a programmatic approach: 
    • Use blocklists and machine learning models to identify and filter out malicious or unwanted content – both on the way in and on the way out.
    • Use semantic routing solutions to categorise incoming queries into topics that can then be sorted into content an LLM assistant is permitted to help with, and those that it should not be let near.
    • Assume harmful outputs may still happen: all URLs generated by the LLM should be either blocked or validated against an allowed domain list.
    • Get the security classics/basics right: output encoding to prevent cross-site scripting (XSS) attacks, and a content security policy.
    • Remind users that all LLM-generated answers should be checked and validated, not blindly trusted.

If either of these options leaves you with questions, then it’s time to speak to our consultants to understand how we can help your business.


Visit www.withsecure.com/en/solutions/consulting to find out more.


By Donato Capitella, Principal Security Consultant, WithSecure

Sponsored by WithSecure
Linked InTwitterFacebook
Business Reporter

23-29 Hendon Lane, London, N3 1RT

23-29 Hendon Lane, London, N3 1RT

020 8349 4363

© 2024, Lyonsdown Limited. Business Reporter® is a registered trademark of Lyonsdown Ltd. VAT registration number: 830519543

We use cookies so we can provide you with the best online experience. By continuing to browse this site you are agreeing to our use of cookies. Click on the banner to find out more.
Cookie Settings