Apple-FM – a command-line interface for Apple's on-device models

brianwestphal · 1 pt · 0 comments

apple-fm

1.0.1 • Public • Published a day ago
0 Dependencies • 2 Dependents • 4 Versions

Apple Intelligence from your command line and your code. apple-fm gives you
the on-device Foundation Models on macOS 26+ / Apple Silicon: free,
private, and fully offline. No API key, no network, nothing leaves your Mac.

Apple ships these models as a Swift-only framework with no command-line
front-end. apple-fm provides one: a fast, lightweight CLI and a TypeScript
library, so you can use the on-device model from your terminal or your code.

Why apple-fm

On-device & private. Runs on the model Apple Intelligence already
installed. No key, no cloud, no telemetry: your prompts never leave the
machine.

One binary, three shapes. probe, one-shot generate (freeform, guided,
or streamed), and an interactive chat, all over a single
NDJSON protocol.

Guaranteed structured output. Hand it a JSON Schema and the output is
guaranteed to conform. This is native guided generation built on Apple's
DynamicGenerationSchema, not best-effort prompting. Guided output can be
streamed, too.

Long conversations stay coherent and cheap. chat holds one on-device
session across turns, reusing the model's cache instead of replaying the
transcript, and automatically summarizes older turns as the small context
window fills.

Lightweight, nothing to audit. Zero runtime dependencies: apple-fm just
talks to the on-device model and gets out of your way.

Future-proof. The helper resolves the current on-device model at runtime,
so OS and model updates are picked up without a rebuild.
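To make the "single NDJSON protocol" point concrete, here is a minimal sketch of how a consumer might decode newline-delimited JSON arriving in arbitrary chunks. The `NdjsonDecoder` class and the `type` / `text` fields in the sample messages are assumptions made for illustration; the actual apple-fm message shapes are not described in this README.

```typescript
// Hypothetical message shape: the real protocol fields are not documented here.
type NdjsonMessage = Record<string, unknown>;

// Incrementally splits a text stream into complete NDJSON messages.
class NdjsonDecoder {
  private buffer = '';

  // Feed one chunk of text; returns every message completed by this chunk.
  push(chunk: string): NdjsonMessage[] {
    this.buffer += chunk;
    const lines = this.buffer.split('\n');
    // The last element is an unfinished line ('' if the chunk ended on a newline).
    this.buffer = lines.pop() ?? '';
    const messages: NdjsonMessage[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === '') continue; // tolerate blank lines
      messages.push(JSON.parse(trimmed) as NdjsonMessage);
    }
    return messages;
  }
}

// A message split across two chunks is only emitted once its newline arrives.
const decoder = new NdjsonDecoder();
const first = decoder.push('{"type":"chunk","text":"Hel');
const second = decoder.push('lo"}\n{"type":"done"}\n');
```

Buffering the trailing partial line is the important design choice: process pipes make no promise that a chunk boundary coincides with a message boundary.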

Install

    npm install -g apple-fm   # CLI
    npm install apple-fm      # library

Requires macOS 26+ on Apple Silicon with Apple Intelligence enabled. Every
release is Developer-ID signed and notarized, so it runs without security
prompts.

Platform support

apple-fm installs on any OS, so it can be a dependency of a cross-platform
project, but the on-device model only runs on macOS 26+ on Apple Silicon.
Everywhere else it degrades gracefully instead of crashing: probe() returns
{ available: false, reason: 'unsupportedPlatform' }, and generate / chat
throw a clear [unsupportedPlatform] error. Gate on isPlatformSupported() or
probe():

    import { isPlatformSupported, probe, generate } from 'apple-fm';

    if ((await probe()).available) {
      // On-device model is ready: generate locally.
      await generate({ prompt: '…' });
    } else {
      // fall back (cloud model, cached result, skip the feature, …)
    }

CLI

    apple-fm probe                             # is the on-device model available?
    apple-fm generate "Summarize: …"           # one-shot text
    cat notes.md | apple-fm generate           # read the prompt from stdin
    apple-fm generate "…" --stream             # stream tokens as they arrive
    apple-fm generate "…" --schema shape.json  # structured/guided JSON output
    apple-fm chat                              # interactive chat (streamed, auto-compacted)

Run apple-fm --help for the full flag list.

Check availability

Before generating, confirm Apple Intelligence is ready on this machine.
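As a rough sketch of what handling the availability result might look like in code, the helper below turns a probe-style result into a user-facing message. The `ProbeResult` interface is inferred from the `{ available: false, reason: 'unsupportedPlatform' }` value quoted under Platform support; treating `reason` as an optional free-form string, and the `describeAvailability` helper itself, are assumptions rather than part of the apple-fm API.

```typescript
// Shape inferred from the README's example value; `reason` being an optional
// free-form string is an assumption.
interface ProbeResult {
  available: boolean;
  reason?: string;
}

// Hypothetical helper: maps a probe result to a message for logs or UI.
function describeAvailability(result: ProbeResult): string {
  if (result.available) return 'On-device model is ready.';
  switch (result.reason) {
    case 'unsupportedPlatform':
      return 'Requires macOS 26+ on Apple Silicon; using fallback.';
    default:
      // Any other reason string is passed through rather than guessed at.
      return `On-device model unavailable (${result.reason ?? 'unknown reason'}); using fallback.`;
  }
}

const message = describeAvailability({ available: false, reason: 'unsupportedPlatform' });
```

In real code the argument would come from `await probe()` instead of a literal object.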

Structured output

Pass a JSON Schema with --schema and apple-fm returns JSON guaranteed to
conform to it (native guided generation, not prompt-and-hope), ready to pipe
into the rest of your tooling.
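For illustration, here is one way the consuming side could look in TypeScript. The `novelSchema` object shows the kind of JSON Schema that might be saved as shape.json; its field names are invented for this example, and which JSON Schema features the guided generator accepts is not stated in this README. Because `JSON.parse` returns an untyped value, the sketch pairs the schema with a small type guard; the sample string stands in for model output.

```typescript
// Hypothetical schema (e.g. the contents of shape.json); field names are illustrative.
const novelSchema = {
  type: 'object',
  properties: {
    title: { type: 'string' },
    author: { type: 'string' },
    year: { type: 'integer' },
  },
  required: ['title', 'author', 'year'],
} as const;

// TypeScript view of the same shape.
interface Novel {
  title: string;
  author: string;
  year: number;
}

// Narrow an unknown parsed value to Novel.
function isNovel(value: unknown): value is Novel {
  if (typeof value !== 'object' || value === null) return false;
  const { title, author, year } = value as { title?: unknown; author?: unknown; year?: unknown };
  return (
    typeof title === 'string' &&
    typeof author === 'string' &&
    typeof year === 'number' &&
    Number.isInteger(year)
  );
}

// Parse CLI or library output and hand back a typed object.
function parseNovel(output: string): Novel {
  const parsed: unknown = JSON.parse(output);
  if (!isNovel(parsed)) throw new Error('Output does not match the expected shape');
  return parsed;
}

// Sample input standing in for the model's JSON output.
const novel = parseNovel('{"title":"Dune","author":"Frank Herbert","year":1965}');
```

If conformance is guaranteed upstream, the guard should never fail; it is there to give the rest of the program a typed value instead of `any`.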

Interactive chat

chat is a multi-turn REPL that streams replies and compacts the transcript
automatically as it approaches the context-window limit. Built-in slash
commands: /reset, /system, /clear, /compact, /help, /quit.

Library

    import { probe, generate, ChatSession } from 'apple-fm';

    if ((await probe()).available) {
      // One-shot
      const summary = await generate({ prompt: 'Summarize: …', system: 'Be terse.' });

      // Streaming
      await generate({ prompt: '…', stream: true }, {}, (chunk) => process.stdout.write(chunk));

      // Guaranteed structured output: pass a JSON Schema, get conforming JSON back
      // (novelSchema is a JSON Schema object you define elsewhere)
      const json = await generate({ prompt: 'A classic sci-fi novel', schema: novelSchema });

      // Multi-turn chat: one persistent on-device session, auto-compacted
      const chat = new ChatSession({ system: 'You are a helpful assistant.' });
      const reply = await chat.send('Hello');
    }

The full API (probe, generate, ChatSession, protocol helpers, and types) is
documented in docs/ai/code-summary.md.

How it works

apple-fm talks to the same on-device model that powers Apple Intelligence;
nothing is sent to the cloud. For chat it keeps one model session alive across
turns so replies stay fast, and automatically summarizes older messages as the
context window fills, so long conversations stay coherent and cheap. And because
it resolves the current on-device model at runtime, OS and model updates are
picked up automatically, with no reinstall.
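The summarize-as-the-window-fills idea can be sketched in a few lines. This illustrates the general technique only: apple-fm's actual compaction strategy, thresholds, and tokenizer are not described in this README, so the `Turn` type, the four-characters-per-token estimate, and the `compact` function are all assumptions.

```typescript
interface Turn {
  role: 'system' | 'user' | 'assistant';
  text: string;
}

// Rough token estimate: about 4 characters per token (a heuristic, not Apple's tokenizer).
const estimateTokens = (turns: Turn[]): number =>
  turns.reduce((sum, t) => sum + Math.ceil(t.text.length / 4), 0);

// When the transcript exceeds `budget`, keep the last `keepRecent` turns verbatim
// and fold everything older into a single summary turn.
function compact(
  turns: Turn[],
  budget: number,
  keepRecent: number,
  summarize: (older: Turn[]) => string,
): Turn[] {
  if (estimateTokens(turns) <= budget || turns.length <= keepRecent) return turns;
  const older = turns.slice(0, turns.length - keepRecent);
  const recent = turns.slice(turns.length - keepRecent);
  return [{ role: 'system', text: summarize(older) }, ...recent];
}

// Stand-in summarizer; a real one would ask the model to condense `older`.
const naiveSummary = (older: Turn[]): string => `Summary of ${older.length} earlier turns.`;

const history: Turn[] = [
  { role: 'user', text: 'x'.repeat(400) },
  { role: 'assistant', text: 'y'.repeat(400) },
  { role: 'user', text: 'What did we decide?' },
];
const compacted = compact(history, 100, 1, naiveSummary);
```

Here the three-turn history is over budget, so the two oldest turns collapse into one summary turn and only the latest question is kept verbatim.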

Documentation

Overview

Disclaimer

apple-fm is an independent, unofficial project and is not affiliated with,
endorsed by, or sponsored by Apple Inc. It simply provides command-line and
programmatic access to Apple's on-device Foundation Models on machines that
already support them (macOS 26+ on Apple Silicon with Apple Intelligence enabled).
Apple, Apple Intelligence, Apple Silicon, Foundation Models, and macOS...

