Show HN: I ported 11 model families to Apple's new on-device AI framework

mlboy | 1 pts | 0 comments

GitHub - john-rocky/coreai-model-zoo: Community model zoo + knowledge base for Apple Core AI (iOS/macOS 27): Qwen3.5 & Gemma 4 converted end-to-end, verified on-device (iPhone 17 Pro GPU/ANE), conversion gotchas, custom Metal kernels, Swift runner · GitHub



CoreAI-Model-Zoo

LLMs converted to Apple Core AI (.aimodel, iOS 27 / macOS 27): downloadable, verified on-device, with the conversion code and a knowledge base. Successor to CoreML-Models.

Models

| Model | Download (.aimodel) | License |
| --- | --- | --- |
| Qwen3.5-0.8B | 🤗 qwen3.5-0.8B-CoreAI | Apache-2.0 |
| Qwen3.5-2B | 🤗 qwen3.5-2B-CoreAI | Apache-2.0 |
| Qwen3.6-35B-A3B (MoE, Mac-only) | 🤗 Qwen3.6-35B-A3B-CoreAI | Apache-2.0 |
| Gemma 4 E2B (text, incl. official-QAT int4) | 🤗 gemma-4-E2B-CoreAI | Gemma |
| Gemma 4 E4B (text, official-QAT int4) | 🤗 gemma-4-E4B-CoreAI | Gemma |
| LFM2.5-1.2B-Instruct | 🤗 LFM2.5-1.2B-CoreAI | LFM Open License v1.0 |
| Granite 4.0-H 1B / 350M | 🤗 granite-4.0-h-CoreAI | Apache-2.0 |
| Qwen3-VL (vision-language) | 🤗 2B · 4B · 8B | Apache-2.0 |
| Gemma 4 E2B vision (VL) (image+text) | vl/ in 🤗 gemma-4-E2B-CoreAI | Gemma |
| RF-DETR nano/small/medium/large (object detection, no NMS) | 🤗 RF-DETR-CoreAI | Apache-2.0 |

Decode throughput (tok/s, greedy; output top-1 exact vs the Hugging Face reference)

| Model | iPhone 17 Pro · GPU | iPhone 17 Pro · ANE | M4 Max · GPU |
| --- | --- | --- | --- |
| Qwen3.5-0.8B | 71.9 | 14.7 | 210 |
| Qwen3.5-2B | 29 | | 161 |
| LFM2.5-1.2B | 45.4 | | 276.5 |
| Granite 4.0-H 1B | 36.3 | | 136.5 |
| Gemma 4 E2B | 30.3 (QAT 30.7) | | 77.0 (QAT 78.9) |
| Gemma 4 E4B (official QAT) | 15.1 | | 55.8 |
| Gemma 4 E2B VL (image+text, official QAT) | 25.5 | | 82.4 |
| Qwen3.6-35B-A3B (MoE, 35B/~3B active, Mac-only) | | | 30.9 |

Measured on the iOS 27 / macOS 27 beta with Apple's coreai-pipelined GPU engine and zero custom kernels (ANE column excepted). Prefill, sizes, and per-model caveats: zoo/.
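
The "top-1 exact" criterion in the table heading can be read as: at every greedy decode step, the converted model picks the same token as the reference, even if the raw logits differ slightly. Below is a minimal sketch of such a check, assuming both models' per-step logits are already available as plain Python lists. The function names and the toy logits are hypothetical, and this is not the repository's verify script.

```python
from typing import List, Sequence


def argmax(logits: Sequence[float]) -> int:
    """Index of the largest logit (first index wins on ties)."""
    best = 0
    for i, value in enumerate(logits):
        if value > logits[best]:
            best = i
    return best


def top1_mismatches(
    ref_steps: Sequence[Sequence[float]],
    test_steps: Sequence[Sequence[float]],
) -> List[int]:
    """Return the decode steps at which the greedy (top-1) token differs."""
    if len(ref_steps) != len(test_steps):
        raise ValueError("reference and test runs have different step counts")
    return [
        step
        for step, (ref, test) in enumerate(zip(ref_steps, test_steps))
        if argmax(ref) != argmax(test)
    ]


# Toy data (hypothetical): 3 decode steps over a 4-token vocabulary.
# The converted logits drift a little, but the winning token is unchanged.
reference = [[0.1, 2.0, 0.3, 0.0], [1.5, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 3.0]]
converted = [[0.1, 1.9, 0.4, 0.0], [1.4, 0.3, 0.1, 0.0], [0.0, 0.1, 0.3, 2.8]]

mismatches = top1_mismatches(reference, converted)
print("top-1 exact" if not mismatches else f"diverges at steps {mismatches}")
```

Comparing token choices rather than raw logits tolerates small numeric drift from quantization or a different compute unit while still catching any change in the generated text.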

Qwen3.6-35B-A3B (MoE, 35B/~3B active): 30.9 tok/s is expert-gather-bound in the current beta; see zoo/qwen3.6.md

RF-DETR: 33–39 FPS live on iPhone 17 Pro, 8.6–19.1 ms/frame on M4 Max; see zoo/rf-detr.md

Gemma 4 E2B VL: same text decoder + a 3-line image splice; see zoo/gemma4-vl.md

CoreAIChat (apps/): the zoo's models running on-device on iPhone.
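
The figures above mix two units: decode throughput in tok/s and RF-DETR latency in ms/frame. Here is a small sketch for converting between them, using only numbers quoted in this README. The helper names are made up for illustration.

```python
def ms_per_token(tokens_per_second: float) -> float:
    """Average time spent on one decoded token, in milliseconds."""
    return 1000.0 / tokens_per_second


def frames_per_second(ms_per_frame: float) -> float:
    """Frame rate implied by a per-frame latency in milliseconds."""
    return 1000.0 / ms_per_frame


# Qwen3.6-35B-A3B decode throughput on M4 Max, as quoted above.
moe_tok_s = 30.9
print(f"Qwen3.6-35B-A3B: {ms_per_token(moe_tok_s):.1f} ms/token")

# RF-DETR latency range on M4 Max, as quoted above (fastest to slowest).
for latency_ms in (8.6, 19.1):
    fps = frames_per_second(latency_ms)
    print(f"RF-DETR: {latency_ms} ms/frame is about {fps:.0f} FPS")
```

These are simple reciprocals, so they describe steady-state averages only; prefill time, model load, and camera or pre-processing overhead are not included.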

Repository layout

| Dir | What |
| --- | --- |
| zoo/ | Model cards: configurations, sizes, parity, measured throughput. |
| knowledge/ | Verified notes on the framework: conversion, compression, stateful KV, custom Metal kernels, AOT, compute-unit rules, the Swift runtime. |
| conversion/ | Re-authored models + convert / verify / compress scripts (PyTorch → .aimodel). |
| swift/ | CoreAIRunner, a Swift package that drives .aimodel LLM bundles, including architectures beyond the standard runtime. |
| apps/ | SwiftUI on-device chat apps (iOS 27): CoreAIChat (Gemma 4 E2B GPU/ANE/⚡ + Qwen3.5 / Qwen3.5-2B / LFM2.5 / Granite ⚡pipelined, one picker) + QwenChatFast (Qwen3.5 static kernels) with in-app model download. |

Start here

Run a model on device → knowledge/swift-runtime.md + the model card

Convert a model → knowledge/conversion-guide.md

Compress → knowledge/compression.md

Make it fast → knowledge/custom-metal-kernels.md · knowledge/performance-ceiling.md

Known beta issue (in-graph KV-write crash; workarounds + the input-mask escape) → knowledge/coreai-beta-mpsgraph-kvwrite-bug.md (FB23024751 / apple/coreai-models#5)

License

BSD-3-Clause (LICENSE). Re-authored model code derives from Apple's BSD-3-Clause coreai_models and retains its notices. Model weights follow their own licenses (see each Hugging Face repo).
