Integrate on-device AI models into your app using Core AI - WWDC26 - Videos - Apple Developer
Integrate on-device AI models into your app using Core AI
Discover a curated collection of popular open-source models — including Qwen, Mistral, SAM3, and more — optimized for Apple silicon using the new Core AI Framework. Learn how to download, run, and benchmark models on your Mac, and integrate them into your app with just a few lines of code. Explore a new workflow for model compilation and on-device specialization to speed up first-time model load. Find out how to profile and optimize runtime performance with Core AI tools in Xcode.
Chapters
0:00 - Introduction
1:16 - App concept: camera-based vocab learning
2:52 - Model discovery
7:40 - Getting models with the Core AI models repository
8:37 - Integration
10:55 - Writing the Swift integration code
13:05 - Diagnosing model specialization latency
14:40 - Deployment
17:00 - Ahead-of-time (AOT) compilation
18:03 - iOS demo
19:57 - Multiplatform
23:06 - Next steps
Resources
Core AI PyTorch Extensions
Core AI Python
Core AI Optimization
Core AI
Compiling Core AI models ahead of time
Related Videos
WWDC26
Explore distributed inference and training with MLX
Explore numerical computing in Swift with MLX
Run local agentic AI on the Mac using MLX
11:01 - Load and run SAM3 image segmentation
import CoreAIImageSegmenter
// Load
let segmenter = try await ImageSegmenter(resourcesAt: sam3ModelURL)
// Use
let response = try await segmenter.segment(image: inputImage, prompt: "flower")
let mask = response.segments.first?.mask
11:28 - Load a language model and create a session
import FoundationModels
import CoreAILanguageModels
// Create model instance
let model = try await CoreAILanguageModel(resourcesAt: qwen3ModelURL)
// Create session using the model
let session = LanguageModelSession(model: model)
// Generate response
let response = try await session.respond(to: "...")
12:29 - Generate structured output with @Generable
import FoundationModels
import CoreAILanguageModels
// Typed fields the model fills in for each vocab card
@Generable
struct VocabCard {
    let chineseWord: String
    let englishMeaning: String
    let exampleSentence: String
}
// Load the model and create a session, as in the previous snippet
let model = try await CoreAILanguageModel(resourcesAt: modelURL)
let session = LanguageModelSession(model: model)
// Ask for a response that is generated directly as a VocabCard
let response = try await session.respond(
    to: "Create a vocab card for flower",
    generating: VocabCard.self
)
let card: VocabCard = response.content
17:22 - Compile a Core AI model ahead of time
$ xcrun coreai-build compile MyModel.aimodel --platform iOS
0:00 - Introduction
Overview of Core AI — a new set of technologies that lets you bring advanced on-device AI capabilities to your apps with no server, no cost per token, and no cloud latency.
1:16 - App concept: camera-based vocab learning
Introduction to the demo app — an iOS language-learning app where students point their camera at real-world objects to generate vocab cards with translations, example sentences, and segmented images, all running on-device.
2:52 - Model discovery
How to define your app's AI requirements — content, language, and device constraints — and select the right models: SAM3 for text-prompted image segmentation and Qwen 0.6B (a 119-language reasoning model) for vocab card generation.
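To make the device-constraint part of that selection concrete, here is a minimal back-of-the-envelope sketch of how weight precision affects the size of a 0.6B-parameter model. The 16-, 8-, and 4-bit precisions are illustrative assumptions, not figures from the session, and the estimate covers weights only (no activations, KV cache, or file-format overhead). The helper names are hypothetical.

```swift
import Foundation

/// Rough size of a model's weights in bytes: parameters × bits per weight ÷ 8.
/// Ignores activations, KV cache, and file-format overhead.
func estimatedWeightBytes(parameterCount: Double, bitsPerWeight: Double) -> Double {
    parameterCount * bitsPerWeight / 8
}

/// Formats a byte count in decimal gigabytes (1 GB = 1e9 bytes).
func gigabytes(_ bytes: Double) -> String {
    String(format: "%.2f GB", bytes / 1e9)
}

let parameters = 0.6e9  // the 0.6B-parameter model named above
for bits in [16.0, 8.0, 4.0] {  // assumed precisions, for illustration only
    let bytes = estimatedWeightBytes(parameterCount: parameters, bitsPerWeight: bits)
    print("\(Int(bits))-bit weights: \(gigabytes(bytes))")
}
// Prints:
// 16-bit weights: 1.20 GB
// 8-bit weights: 0.60 GB
// 4-bit weights: 0.30 GB
```

The same arithmetic applies to any candidate model, which makes it a quick first filter before exporting and benchmarking.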
7:40 - Getting models with the Core AI models repository
How to use the coreai-models GitHub repository to find popular models with ready-made export recipes — browsing the catalog, running the export script for SAM3 and Qwen, and getting optimized .aimodel files.
8:37 - Integration
How to inspect .aimodel files in Xcode (size, platform targets, function signatures, tensor shapes), add the coreai-models Swift package, and select the CoreAILM and CoreAISegmentation libraries as app dependencies.
10:55 - Writing the Swift integration code
How to write the Swift code to use both models — loading SAM3 and running text-prompted segmentation, loading Qwen with a single CoreAILanguageModel line, and using the familiar LanguageModelSession API from Foundation Models with structured @Generable output for typed vocab card fields.
13:05 - Diagnosing model specialization latency
Using the new Core AI Instruments template to identify that first-run latency is caused by model specialization — the process that compiles a Core AI model for the specific device — and understanding when and how to handle it gracefully.
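One way to handle a slow first load gracefully is to pay for it once and share the result. The sketch below is plain Swift concurrency, not a Core AI API: `ModelCache` is a hypothetical helper, and the loading closure stands in for whatever call actually loads the model (for example, the initializers shown in the code snippets). It assumes the loaded value is `Sendable`, and for simplicity a failed load is not retried.

```swift
/// Loads an expensive resource at most once and shares the result.
/// `Model` stands in for whatever type the real loading call returns.
actor ModelCache<Model: Sendable> {
    private var task: Task<Model, Error>?
    private let load: @Sendable () async throws -> Model

    init(load: @escaping @Sendable () async throws -> Model) {
        self.load = load
    }

    /// Returns the cached model, starting the load on the first call.
    /// Callers that arrive while the load is in flight await the same task.
    func model() async throws -> Model {
        if let task { return try await task.value }
        let newTask = Task<Model, Error> { [load] in try await load() }
        task = newTask
        return try await newTask.value
    }
}

// Usage with a stand-in loader that simulates a slow first load.
let cache = ModelCache<String> {
    try await Task.sleep(nanoseconds: 50_000_000)
    return "loaded-model"
}
let first = try await cache.model()   // waits for the simulated load
let second = try await cache.model()  // returns the cached value
print(first == second)
```

Starting the first `model()` call early, such as while a first-run screen is on display, moves the wait away from the moment the user first uses the feature.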
14:40 - Deployment
How to design a deliberate deployment strategy: using a first-run experience to introduce the feature, keeping models out of the app bundle to avoid bloating update size for all users, and triggering on-demand model download via Background Assets only when the user opts in.
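The opt-in flow described above can be sketched as a small state machine. Everything here is hypothetical scaffolding: `FeatureGate`, `ModelAvailability`, and the `ModelDownloading` protocol are illustrative names, and the protocol only stands in for the real download mechanism; it is not the Background Assets API.

```swift
/// Where the feature's model assets stand for this user.
enum ModelAvailability: Equatable {
    case notRequested      // user has not opted in; nothing is downloaded
    case downloading
    case ready
    case failed(String)
}

/// Hypothetical abstraction over the download mechanism.
protocol ModelDownloading: Sendable {
    func downloadModels() async throws
}

/// Starts a download only after an explicit opt-in and tracks the result.
actor FeatureGate {
    private(set) var availability: ModelAvailability = .notRequested
    private let downloader: any ModelDownloading

    init(downloader: any ModelDownloading) {
        self.downloader = downloader
    }

    private var canStart: Bool {
        switch availability {
        case .notRequested, .failed: return true
        case .downloading, .ready: return false
        }
    }

    /// Call from the first-run screen after the user chooses to enable the feature.
    func userOptedIn() async {
        guard canStart else { return }  // ignore repeat taps and completed downloads
        availability = .downloading
        do {
            try await downloader.downloadModels()
            availability = .ready
        } catch {
            availability = .failed(String(describing: error))
        }
    }
}

// Usage with a stand-in downloader that succeeds immediately.
struct FakeDownloader: ModelDownloading {
    func downloadModels() async throws {}
}
let gate = FeatureGate(downloader: FakeDownloader())
print(await gate.availability)  // notRequested
await gate.userOptedIn()
print(await gate.availability)  // ready
```

Keeping the download behind a protocol lets the first-run UI be tested without network access, and users who never opt in never trigger a download.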
17:00 - Ahead-of-time (AOT) compilation
How to use the coreai-build command to perform compilation ahead-of-time on your development machine — generating...