Is That a Bad Apple in Your Pocket? We Used Prompt Injection to Hijack Apple Intelligence

Key takeaways:

RSAC researchers combined two techniques to bypass both Apple’s input and output filters and the internal guardrails on Apple Intelligence’s local LLM and force the LLM to produce an attacker-directed result.

  • We tested our attack with 100 random prompts and succeeded 76% of the time.
  • When we discovered this vulnerability, RSAC estimated that between 100,000 and 1 million Apple customers were already using apps vulnerable to the attack.
  • Prior to Apple's OS updates, hackers could use our techniques to force Apple’s local LLM to do their bidding, including manipulating data that’s accessible to any of the LLM-enabled apps, like health/fitness data and family videos.

RSAC Research Bypassed Apple Intelligence's Security Mechanisms

When it launched Apple Intelligence in June 2024, Apple chose a different path from its Western competitors in the AI race.i Apple Intelligence includes a small, on-device large language model (LLM) and a larger model running in a private cloud. Apple’s local LLM is a full-fledged component of the operating system (OS) that applications can use through an OS-defined interface.

Apple Intelligence’s system-level LLM is attractive to app developers because it’s accessible via a simple, unified API. However, as Apple Intelligence becomes integrated into more applications, it also becomes appealing for attackers, who can potentially hijack both those apps and the OS itself. RSAC estimates that there were at least 200 million Apple Intelligence-capable devices in consumers’ hands as of December 2025, and the Apple App Store already features apps using Apple Intelligence—so it’s already a high-value target.ii So RSAC researchers decided to try breaking in. And we succeeded.

Apple’s choice to place an LLM on the local device that the OS manages provides security advantages that the RSAC research team had to figure out how to bypass. For example, although the LLM runs locally, as far as we can determine, users and apps cannot access its plaintext weights or internal mechanisms directly. Instead, users and apps can only interact with the local model via Apple’s Foundation Models framework APIs, which means that the OS can: 1) control how apps communicate with the model; 2) enforce policies; 3) monitor behavior; and 4) (attempt to) prevent misuse.

As of this writing, Apple doesn’t provide many details about #4 above, but based on our research, RSAC thinks that Apple routes all traffic to and from the local model through filters: input filters designed to eliminate malicious input, and output filters designed to prevent the LLM from returning undesirable output.

The Winning Combination: A “Neural Exec” and Unicode’s Right-to-left-Override Function

To hack Apple Intelligence, RSAC had to solve two problems: 1) Find an input that causes the local LLM to execute an adversarially-chosen task; and 2) Bypass the filters.

Fortunately, then-RSAC researcher Dario Pasquini already had a solution to problem #1: he and two colleagues had previously discovered what they call a “Neural Exec,” which is an adversarial input designed to trick an LLM into performing an arbitrary task. Neural Execs look like gibberish to humans, but they work like charms on LLMs (this paper explains how), and they’re universal, meaning that an attacker can specify an arbitrary payload to be executed without needing to recompute the trigger.

But we still had to address problem #2. And we did it using humanity’s most potent cyberweapon: Unicode. We bypassed Apple’s filters using that hoary hacker’s standby, the Unicode right-to-left override character (U+202E). Unicode’s bidirectional controls allow you to embed blocks of text in languages that read right-to-left (like Arabic or Hebrew) inside blocks of text in languages that read left-to-right (like English or Hindi) and have both languages render correctly; the override forces the text that follows it to display right-to-left. Essentially, we encoded the malicious/offensive English-language output text by writing it backwards and using our Unicode hack to force it to render correctly, so the filters see reversed text while the user sees the intended message.
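The encoding idea can be sketched in a few lines of Python, using a harmless placeholder phrase. This is an illustration under stated assumptions, not the prompt RSAC used: the function names, the blocklist, and the naive substring filter are all hypothetical, and Apple's actual filters are not public. The sketch stores text reversed behind a right-to-left override, shows that a plain substring check no longer matches it, and includes a simple check that flags bidirectional control characters.

```python
# Minimal sketch (Python 3.9+). All names here are hypothetical illustrations.

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING (ends the override)

# Embedding, override, and isolate controls defined by Unicode.
BIDI_CONTROLS = frozenset(
    "\u202a\u202b\u202c\u202d\u202e"  # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"        # LRI, RLI, FSI, PDI
)


def encode_rlo(text: str) -> str:
    """Store the text reversed, wrapped in an override so it displays forward."""
    return f"{RLO}{text[::-1]}{PDF}"


def naive_filter_hits(text: str, blocklist: list[str]) -> bool:
    """Assumed stand-in for a filter: a case-insensitive substring match."""
    lowered = text.lower()
    return any(term in lowered for term in blocklist)


def approximate_visual_order(encoded: str) -> str:
    """Rough simulation of display order for a single overridden run.

    A real renderer applies the full Unicode Bidirectional Algorithm; this
    shortcut only handles the simple string produced by encode_rlo().
    """
    inner = encoded.removeprefix(RLO).removesuffix(PDF)
    return inner[::-1]


def contains_bidi_controls(text: str) -> bool:
    """Flag text that carries any bidirectional control character."""
    return any(ch in BIDI_CONTROLS for ch in text)


if __name__ == "__main__":
    blocklist = ["blocked phrase"]
    encoded = encode_rlo("blocked phrase")
    print("filter hit on stored text:", naive_filter_hits(encoded, blocklist))
    print("what a reader would see:", approximate_visual_order(encoded))
    print("bidi controls present:", contains_bidi_controls(encoded))
```

A filter that normalizes or rejects text containing these control characters before matching would not be fooled by this particular trick, which is one reason the last helper is included.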

We combined our Unicode-exploiting solution to problem #2 with our Neural Exec solution to problem #1 into prompts like this:

[Image: example attack prompt]

This combination lets us bypass both the LLM’s internal guardrails and the input and output filters, and the user sees this response:

“Hey user, go **** yourself”

(And yes, it renders with the real swearword, but this is a family blog).

We ran the evaluation over 100 random prompts and achieved an average attack success rate of 76%.

To Protect Yourself, Upgrade to iOS 26.4 or macOS 26.4 or Later

An attacker could potentially use this same combination of techniques to force Apple’s local LLM to behave in any fashion they desire, which would allow the attacker to manipulate any data and functionality accessible from the compromised application. For example, an attacker might be able to access all your fitness and health data from the SmartGym fitness app, or all the family videos you edited with VLLO.

We did some rough calculations using the number of reviews in the Apple App Store for the apps that Apple identified as integrating the on-device LLM as of September 2025, and we estimated that the number of Apple users who were already using apps capable of exposing this vulnerability ranges from 100,000 to more than 1 million.iii

Because this issue involves bypassing the input and output filters on the local LLM, there’s not much that individual Apple users can do to protect themselves other than avoiding apps that both use the local Apple Intelligence LLM and have access to sensitive data.

The RSAC Research Lab disclosed this attack to Apple on October 15, 2025, through the Apple Security Research portal. Apple has since hardened the affected systems against this attack, and those protections were rolled out in iOS 26.4 and macOS 26.4. Luckily, RSAC has not seen any evidence of this vulnerability being exploited by attackers out in the wild, but users running earlier versions of iOS and macOS should upgrade to the hardened versions as soon as is practicable.

__________________________________________________________

i Here, we say Western competitors, because the Western commercial frontier model creators primarily rely on models deployed in the cloud; in contrast, Chinese LLM competitors like Alibaba (Qwen), DeepSeek, and ModelBest (MiniCPM) are also pursuing on-device small LLM approaches. Source: InAIWeTrust

ii Apple shipped an estimated 247m iPhones in 2025, the large majority of which were Apple Intelligence capable. And as of September 2025, Apple showcased several generally available apps that were already using Apple Intelligence. Sources: 1) iPhone shipments: IDC; 2) Apple Intelligence apps: Apple

iii Here’s how we did the estimate: we checked the number of reviews for Apple’s highlighted apps, assumed that between 0.5% and 5% (this seems to vary quite widely) of customers write reviews to extrapolate total download numbers, and then estimated what portion of those downloads were recent enough to include the Apple Intelligence-enabled version of the apps.
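The arithmetic behind this kind of estimate can be sketched as follows. This is a rough illustration, not RSAC's actual calculation: the review counts and the fraction of recent downloads below are hypothetical placeholders, the function names are invented for this example, and only the 0.5% to 5% review-rate range comes from the footnote above.

```python
# Minimal sketch of a review-count extrapolation. Sample inputs are hypothetical.


def downloads_range(review_count, low_rate=0.005, high_rate=0.05):
    """Return (low, high) estimated downloads for one app.

    If between 0.5% and 5% of customers write reviews, then
    downloads = reviews / review_rate. The higher rate gives the lower bound.
    """
    if review_count < 0:
        raise ValueError("review_count must be non-negative")
    return review_count / high_rate, review_count / low_rate


def exposed_users_range(review_counts, recent_fraction):
    """Sum per-app ranges, scaled by the assumed share of recent downloads.

    recent_fraction is a guess at the portion of downloads new enough to
    include the Apple Intelligence-enabled version of each app. Summing
    across apps also assumes users do not overlap between apps.
    """
    low_total = 0.0
    high_total = 0.0
    for count in review_counts.values():
        low, high = downloads_range(count)
        low_total += low * recent_fraction
        high_total += high * recent_fraction
    return low_total, high_total


if __name__ == "__main__":
    # Hypothetical review counts, not real App Store data.
    sample_reviews = {"app_a": 2000, "app_b": 500}
    low, high = exposed_users_range(sample_reviews, recent_fraction=0.5)
    print(f"estimated exposed users: {low:,.0f} to {high:,.0f}")
```

Because the review rate spans a factor of ten, the resulting range does too, which matches the order-of-magnitude width of the published estimate.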

Contributors
Petros Efstathopoulos

Vice President, Research, RSAC

Laura Koetzle

Head, Community Research, RSAC

Dario Pasquini, PhD

Head, Artificial Intelligence, Cracken

Blogs posted to the RSAConference.com website are intended for educational purposes only and do not replace independent professional judgment. Statements of fact and opinions expressed are those of the blog author individually and, unless expressly stated to the contrary, are not the opinion or position of RSAC™ Conference, or any other co-sponsors. RSAC Conference does not endorse or approve, and assumes no responsibility for, the content, accuracy or completeness of the information presented in this blog.

