Show HN: I ported 11 model families to Apple's new on-device AI framework

john-rocky/coreai-model-zoo: Community model zoo + knowledge base for Apple Core AI (iOS/macOS 27): Qwen3.5 & Gemma 4 converted end-to-end, verified on-device (iPhone 17 Pro GPU/ANE), conversion gotchas, custom Metal kernels, Swift runner



CoreAI-Model-Zoo

LLMs converted to Apple Core AI (.aimodel, iOS 27 / macOS 27) — downloadable, verified on-device, with the conversion code and a knowledge base. Successor to CoreML-Models.

Models

| Model | Download (.aimodel) | License |
| --- | --- | --- |
| Qwen3.5-0.8B | 🤗 qwen3.5-0.8B-CoreAI | Apache-2.0 |
| Qwen3.5-2B | 🤗 qwen3.5-2B-CoreAI | Apache-2.0 |
| Qwen3.6-35B-A3B (MoE, Mac-only) | 🤗 Qwen3.6-35B-A3B-CoreAI | Apache-2.0 |
| Gemma 4 E2B (text, incl. official-QAT int4) | 🤗 gemma-4-E2B-CoreAI | Gemma |
| Gemma 4 E4B (text, official-QAT int4) | 🤗 gemma-4-E4B-CoreAI | Gemma |
| LFM2.5-1.2B-Instruct | 🤗 LFM2.5-1.2B-CoreAI | LFM Open License v1.0 |
| Granite 4.0-H 1B / 350M | 🤗 granite-4.0-h-CoreAI | Apache-2.0 |
| Qwen3-VL (vision-language) | 🤗 2B · 4B · 8B | Apache-2.0 |
| Gemma 4 E2B vision (VL) (image+text) | vl/ in 🤗 gemma-4-E2B-CoreAI | Gemma |
| RF-DETR nano/small/medium/large (object detection, no NMS) | 🤗 RF-DETR-CoreAI | Apache-2.0 |

Decode throughput (tok/s, greedy; output top-1 exact vs the Hugging Face reference)

| Model | iPhone 17 Pro · GPU | iPhone 17 Pro · ANE | M4 Max · GPU |
| --- | --- | --- | --- |
| Qwen3.5-0.8B | 71.9 | 14.7 | 210 |
| Qwen3.5-2B | 29 | | 161 |
| LFM2.5-1.2B | 45.4 | | 276.5 |
| Granite 4.0-H 1B | 36.3 | | 136.5 |
| Gemma 4 E2B | 30.3 (QAT 30.7) | | 77.0 (QAT 78.9) |
| Gemma 4 E4B (official QAT) | 15.1 | | 55.8 |
| Gemma 4 E2B VL (image+text, official QAT) | 25.5 | | 82.4 |
| Qwen3.6-35B-A3B (MoE, 35B/~3B active, Mac-only) | | | 30.9 |

Measured on the iOS 27 / macOS 27 beta with Apple's coreai-pipelined GPU engine and zero custom kernels (ANE column excepted). Prefill numbers, sizes, and per-model caveats are in zoo/.
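To make "greedy; output top-1 exact vs the Hugging Face reference" concrete, here is a minimal sketch of that kind of parity check. It is an illustration only: the repo's own verify scripts live in conversion/ and are not reproduced here, and `reference_step` and `converted_step` are hypothetical toy stand-ins for a reference model and a converted model.

```python
from typing import Callable, List, Sequence

StepFn = Callable[[List[int]], Sequence[float]]  # token ids -> next-token logits


def argmax(logits: Sequence[float]) -> int:
    """Index of the largest logit (first index wins on ties)."""
    return max(range(len(logits)), key=lambda i: logits[i])


def greedy_decode(step: StepFn, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Greedy decoding: always append the top-1 token. Returns only the new tokens."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(argmax(step(tokens)))
    return tokens[len(prompt):]


def first_mismatch(a: Sequence[int], b: Sequence[int]) -> int:
    """Position of the first differing token, or -1 if the sequences are identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return -1 if len(a) == len(b) else min(len(a), len(b))


# Hypothetical toy models over a 5-token vocabulary (assumptions, not real models).
def reference_step(tokens: List[int]) -> List[float]:
    peak = sum(tokens) % 5
    return [-abs(i - peak) for i in range(5)]


def converted_step(tokens: List[int]) -> List[float]:
    # Same logits with a small numeric drift, as quantization or a different
    # backend might introduce; the drift is too small to change the argmax.
    return [x + 0.01 * i for i, x in enumerate(reference_step(tokens))]


ref = greedy_decode(reference_step, [1, 2], 8)
conv = greedy_decode(converted_step, [1, 2], 8)
print("top-1 exact" if first_mismatch(ref, conv) == -1 else "diverged")
```

Under greedy decoding a single top-1 flip changes every later token, so reporting the index of the first mismatch is usually more useful than a bare pass/fail.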

Qwen3.6-35B-A3B (MoE, 35B/~3B active) — 30.9 tok/s is expert-gather-bound in the current beta; zoo/qwen3.6.md

RF-DETR — 33–39 FPS live on iPhone 17 Pro, 8.6–19.1 ms/frame on M4 Max; zoo/rf-detr.md

Gemma 4 E2B VL — same text decoder + a 3-line image splice; zoo/gemma4-vl.md
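The figures above mix tok/s, FPS, and ms/frame. A small sketch for putting them on one scale follows; the helper names are illustrative, and the inputs are copied from the numbers quoted in this README (Qwen3.5-0.8B decode, the Mac-only MoE, and the RF-DETR M4 Max range).

```python
def ms_per_token(tok_per_s: float) -> float:
    """Average decode latency per token, in milliseconds."""
    return 1000.0 / tok_per_s


def fps_from_ms(ms_per_frame: float) -> float:
    """Frames per second implied by a per-frame latency in milliseconds."""
    return 1000.0 / ms_per_frame


# Decode throughput in tok/s, as quoted in this README.
qwen35_08b = {
    "iPhone 17 Pro GPU": 71.9,
    "iPhone 17 Pro ANE": 14.7,
    "M4 Max GPU": 210.0,
}
qwen36_moe_m4_max = 30.9

for device, tps in qwen35_08b.items():
    print(f"Qwen3.5-0.8B on {device}: {ms_per_token(tps):.1f} ms/token")
print(f"Qwen3.6-35B-A3B on M4 Max GPU: {ms_per_token(qwen36_moe_m4_max):.1f} ms/token")

# Ratio of iPhone GPU to iPhone ANE decode speed for the same model.
gpu_vs_ane = qwen35_08b["iPhone 17 Pro GPU"] / qwen35_08b["iPhone 17 Pro ANE"]
print(f"iPhone GPU vs ANE: {gpu_vs_ane:.1f}x")

# RF-DETR on M4 Max is quoted as 8.6 to 19.1 ms/frame; convert to FPS.
rfdetr_fps_fast = fps_from_ms(8.6)
rfdetr_fps_slow = fps_from_ms(19.1)
print(f"RF-DETR on M4 Max: {rfdetr_fps_slow:.0f} to {rfdetr_fps_fast:.0f} FPS")
```

These conversions only restate the published averages in different units. They say nothing about prefill time or per-token variance, which the model cards in zoo/ cover.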

CoreAIChat (apps/) — the zoo's models running on-device on iPhone.

Repository layout

| Dir | What |
| --- | --- |
| zoo/ | Model cards: configurations, sizes, parity, measured throughput. |
| knowledge/ | Verified notes on the framework: conversion, compression, stateful KV, custom Metal kernels, AOT, compute-unit rules, the Swift runtime. |
| conversion/ | Re-authored models + convert / verify / compress scripts (PyTorch → .aimodel). |
| swift/ | CoreAIRunner, a Swift package that drives .aimodel LLM bundles, including architectures beyond the standard runtime. |
| apps/ | SwiftUI on-device chat apps (iOS 27): CoreAIChat (Gemma 4 E2B GPU/ANE/⚡ + Qwen3.5 / Qwen3.5-2B / LFM2.5 / Granite ⚡pipelined, one picker) + QwenChatFast (Qwen3.5 static kernels) with in-app model download. |

Start here

Run a model on device → knowledge/swift-runtime.md + the model card

Convert a model → knowledge/conversion-guide.md

Compress → knowledge/compression.md

Make it fast → knowledge/custom-metal-kernels.md · knowledge/performance-ceiling.md

Known beta issue (in-graph KV-write crash; workarounds + the input-mask escape) → knowledge/coreai-beta-mpsgraph-kvwrite-bug.md — FB23024751 / apple/coreai-models#5

License

BSD-3-Clause (LICENSE). Re-authored model code derives from Apple's BSD-3-Clause coreai_models and retains its notices. Model weights follow their own licenses (see each Hugging Face repo).
