The future of Siri, or: why private inference isn't private enough

The future of Siri, or: why private inference isn’t private enough – A Few Thoughts on Cryptographic Engineering

Home

The future of Siri, or: why private inference isn’t private enough

Matthew Green<br>in Apple, privacy

June 9, 2026June 9, 2026

2,854 Words

Yesterday Apple announced a big step towards deploying real AI in their Siri ecosystem. In most ways this is good and inevitable: Siri is one of the world’s most widely-used voice agents, and it would be good if it didn’t suck. The idea that Apple would boost its capabilities with frontier models wasn’t so much a matter of if, but a question of when and who.

The who turns out to be Google: Apple looks like it will use some combination of Google Gemini models, combined with Google’s Confidential Inference and Apple’s own Private Cloud Compute for private hosting. These systems will process both your queries and evaluate private data from your devices. Apple’s marketing pitches the advantages as follows:

First, since your phone already has context about you — meaning, your private information, schedules, email, text messages — an AI-enabled Siri can potentially offer more useful answers to your practical requests than external LLMs. Want to schedule a reservation for next week’s birthday party? In theory, a future Siri-AI might already know who’s coming, and what kind of cake they like.

Of course, what Apple calls "context" is also the raw data of your life. This is deeply private data from all of your apps, and that data can’t just be shipped to random adtech companies (or Sam Altman) for processing. Your context needs to be protected, and Apple bills itself as a privacy company.

There’s some tension between these goals. Apple has addressed this by marketing a service it calls Private Cloud Compute, or PCC. PCC was introduced in 2024 as a private model inference system that ran entirely on Apple Silicon, using a set of "trusted" hardware security modules running in Apple’s datacenters. The goal of this system is to ensure that your data never leaves Apple’s hardware: it’s encrypted from your phone to a dedicated server, and then it disappears once a response reaches your phone. The stateless design of PCC ensures (in theory) that your data doesn’t linger, and the design of the hardware prevents even Apple from seeing the inputs.

Apple has since "expanded" PCC to encompass Google’s hardware as well. I will confess that I find the details of the new "expanded" PCC just a bit vague. It sounds a lot like Apple is primarily going to rely on Google’s existing confidential compute (running in Google datacenters) to process this data, but they’re bolting on a new layer of technical security to control which models are actually running. In any case: security experts can argue about whether this is good enough to keep Cozy Bear away from your data. What I will grant is that it’s probably good enough to keep Google and Apple from accessing your stuff, which is what most people are worried about in the first place.

So why am I so nervous?

A brief scenario involving private agents

To illustrate how agents might work, it’s helpful to consider an example use case. Let’s imagine that you’re planning a business dinner for six people. This involves several subtasks:

You need to juggle the participants’ schedules, know when they’re in town and available to meet.

You need to choose the appropriate restaurant based on menu and location. This might depend on what you know about the participants’ preferences: Mike is wildly allergic to szechuan peppercorn, for example, which rules out quite a few options.

With these time/cuisine/location constraints in place, you’ll need to search for a restaurant that actually has a table for six in the right place.

Finally, you’ll need to book the reservation, mark your calendar, and alert your attendees.

In the past, this type of scheduling required a significant amount of human effort. The beauty of AI agents is that, in theory, this is exactly the sort of project that can be automated. The agent can first scan your recent conversations to answer the questions needed for steps (1) & (2), then it can conduct the searches described in step (3). With a nod from you, it can even author the calendar invites and text messages required to complete step (4).

So what’s the problem here?

The first and unsurprising observation is that being useful on these tasks requires your agent to have context, which means: relatively unrestricted access to your private data. You know about your invitees’ availability because they texted it to you. You know about Mike’s allergy because you’ve talked about it with him or jotted it down somewhere. (This could mean iMessages, email, contacts, or personal notes.) Re-entering all of this data into an agent would be annoying and time consuming and the whole point of an agent is to save you time. The winning personal assistant doesn’t win just because it’s smart: it wins because...

The future of Siri, or: why private inference isn't private enough

Related Articles

The Newest Instagram "Exploit" Is the Goofiest I've Seen

Apple WWDC 2026 Livestream

Claude Fable 5

It's Not Just X. It's Y

Show HN: GoPeek – open links in live mini browser windows without new tabs