How to Use WindowSill with Ollama for Private AI Writing

How to Use WindowSill with Ollama for Private AI Writing | WindowSill Blog

Some things shouldn't leave your computer. Medical notes, legal drafts, journal entries, messages to your therapist, that honest performance review you're still editing. When you paste those into a cloud AI service, they travel to a server you don't control, get processed by infrastructure you can't inspect, and may end up in training data you never agreed to.

That bothered me enough to build local LLM support into WindowSill. And one of the easiest way to set it up is with Ollama, a free tool that runs AI models directly on your hardware.

This guide walks you through the setup. By the end, you'll have grammar checking, text rewriting, tone adjustment, and translation running locally in any Windows app, with zero data leaving your machine.

What you'll need

WindowSill installed from the Microsoft Store (free tier is fine for setup, but AI features require WindowSill+)

Ollama installed from ollama.com

8 GB of RAM minimum (16 GB recommended for comfortable use alongside other apps)

A few gigabytes of disk space for the model you choose

No GPU required. Ollama runs on CPU too, just slower. If you have an NVIDIA GPU with 6+ GB of VRAM, responses will be noticeably faster.

Step 1: Install and start Ollama

Download Ollama from ollama.com and run the installer. Once installed, Ollama runs as a background service on your machine. You can verify it's working by opening a terminal and running:

ollama --version

If you see a version number, you're good.

Step 2: Pull a model that's good for writing

Not all models are equal for writing tasks. You want something that understands grammar, tone, and natural language well. Here are three solid options:

Model Size Best for Pull command

Deepseek R1 8B ~8 GB General writing, grammar, rewriting ollama pull deepseek-r1:8b

Qwen 3.5 4B ~4.5 GB Lighter machines, still capable ollama pull qwen3.5:4b

Deepseek R1 1.5B ~1 GB Fast responses, good grammar ollama pull deepseek-r1:1.5b

Open a terminal and pull whichever model fits your hardware:

ollama pull qwen3.5:4b

This downloads the model to your machine. It only needs to happen once.

A note on model size vs. quality: Larger models (13B+ parameters) produce better writing output, but they need more RAM and a decent GPU to run at a usable speed. For most writing tasks (grammar fixes, tone adjustments, short rewrites), an 8B model is more than enough. Start small and upgrade if you need to.

Step 3: Connect WindowSill to Ollama

Ollama exposes a local API at http://localhost:11434. WindowSill can connect to it automatically.

In WindowSill:

Open the Settings from the command bar

Go to the AI Writing & Analysis section

Under AI Providers, select Ollama

Set the Ollama access point to http://localhost:11434

Select the model you pulled (e.g., qwen3.5:4b)

That's it. WindowSill now routes AI requests to Ollama instead of a cloud service.

Step 4: Test it

Open any app where you write (Word, Outlook, Notion, Slack, a browser, anything). Type a sentence with a deliberate mistake:

Their going to the meeting tommorrow at 3pm, can you confirmed?

Select the text. WindowSill's Analyze / Rewrite sill should appear on the bar. Hit the Spell Check option.

If everything is connected, the corrected text will come back after a few seconds:

They're going to the meeting tomorrow at 3 PM. Can you confirm?

The first request might take a moment while Ollama loads the model into memory. Subsequent requests will be faster.

What you can do with local AI

Once connected, all of WindowSill's AI writing features work through your local model:

Grammar and spell check. Select text in any app, fix errors without opening a separate tool.

Rewriting. Highlight a paragraph and ask for a rewrite. Useful for polishing drafts or simplifying dense writing.

Tone adjustment. Switch between professional, casual, and attention-grabbing tones. You can also create custom tone presets for recurring needs (e.g., "customer support reply" or "executive summary").

Translation. Select text and translate to any of 35+ supported languages. The quality depends on the model you chose. Llama 3 handles common language pairs (English/Spanish, English/French, English/German) well. For less common pairs, a larger model or a specialized translation model works better.

Custom prompts. Build reusable prompts with variable injection. For example, a prompt that takes selected text and converts it into a formatted meeting recap with today's date auto-inserted.

Summarization. Select a long email or document section and get a summary.

All of this happens on your machine. Nothing goes to OpenAI, Google, or anyone else.

Performance tips

Local models are slower than cloud APIs. Here's how to keep things comfortable:

Close the model when you're not using it. Ollama keeps models in memory. If you're done writing and need the RAM for something else, run ollama stop qwen3.5:4b in a...

How to Use WindowSill with Ollama for Private AI Writing

Related Articles

Claude Fable 5

US Government directive to suspend access to Fable 5 and Mythos 5

Is AI ruining our skills? Early results are in – and they're not good

The Anatomy of an AI-Native Org

Apertus – Open Foundation Model for Sovereign AI