A 1B humanizer that matches human writing on an AI detector

codelion1 pts0 comments

A 1B humanizer that matches human writing on an AI detector · mlx-optiq

Engineering · June 1, 2026<br>A 1B humanizer that matches human writing on an AI detector.

Topic Fine-tuning · Alignment<br>Reading time 6 min<br>Related sensitivity-aware LoRA

Two LoRA adapters stacked on a 1B model close 100 % of the gap to human writing on the RADAR AI detector. Source AI drafts score P(AI) = 0.51. The human reference scores 0.37. Our stacked SFT + DPO LoRAs on MiniCPM5-1B-OptiQ-4bit land at 0.37 — exactly the human number. Everything runs locally on a 24 GB Mac with the v0.1.4 release of OptIQ.

One 875 MB base model, two 120 MB LoRA adapters, one HTTP request body that says "adapter": "sft+dpo". RADAR can no longer tell the rewrites from the human references.

The result

200 held-out AI drafts from the EditLens ICLR 2026 corpus, rewritten by each system, then scored by RADAR-Vicuna-7B. Lower P(AI) means more human-like.

PipelineP(AI) ↓Δ vs sourceGap closed

Source AI draft (Qwen3.5-4B + gemma-4-e4b)0.51——<br>MiniCPM5-1B + SFT humanizer LoRA0.50-0.017 %<br>MiniCPM5-1B + SFT + DPO LoRAs stacked 0.37-0.14100 %<br>Human reference (target)0.37-0.14100 %

The full slop-phrase frequency (boilerplate AI patterns like "a testament to", "underscores the importance of") drops from 0.6 per 1 K tokens in the source to 0.0 in the stacked output — actually lower than the human reference's 0.1.

Why a 1 B model can do this matters: the base model is 875 MB on disk, the two adapters are 120 MB each, and everything runs locally on consumer hardware. You don't need a 70 B humanizer behind an API key for this.

The recipe in three commands

OptIQ 0.1.4 ships every piece. The full pipeline is:

terminal · 1. quantizebash

$ pip install 'mlx-optiq>=0.1.4'

$ optiq convert openbmb/MiniCPM5-1B \<br>--target-bpw 5.0 --candidate-bits 4,8 \<br>--output ./optiq_mixed

Sensitivity-aware mixed-precision quantization. Most layers land at 4-bit, the sensitive ones at 8-bit. Result is 875 MB and only 1.06 GB short of the bf16 base on Capability Score (eval framework).

terminal · 2. train SFT, then DPO continuing from itbash

$ optiq lora train ./optiq_mixed \<br>--data ./sft_dataset --method sft \<br>--preset large --iters 600 \<br>--output ./adapters/humanizer-sft

$ optiq lora train ./optiq_mixed \<br>--data ./dpo_dataset --method dpo \<br>--preset large --iters 300 \<br>--mount-adapter ./adapters/humanizer-sft \<br>--output ./adapters/humanizer-dpo

The --mount-adapter flag is the textbook SFT → DPO continuation recipe. It stacks a frozen SFT LoRA alongside a trainable DPO LoRA on every adapted layer. During training, the DPO reference forward zeroes only the trainable scale — so the KL term is anchored against base + SFT (the SFT model), which is the standard alignment-pipeline definition of "DPO continuing from SFT." The saved adapter contains only the DPO delta.

terminal · 3. serve both, stacked per-requestbash

$ optiq serve --model ./optiq_mixed \<br>--adapter ./adapters/humanizer-sft \<br>--adapter ./adapters/humanizer-dpo

# request body activates both with the "+" operator:<br>$ curl localhost:8080/v1/chat/completions \<br>-d '{"model":"...","messages":[...],<br>"adapter":"humanizer-sft+humanizer-dpo"}'

optiq serve mounts both adapters on a single base. The request body's adapter field, given an a+b form, applies both LoRA residuals simultaneously — a single base, two adapters, one inference pass. The classic single-adapter syntax ("adapter": "humanizer-sft") still works, and so does the sentinel "adapter": "base" to bypass adapter activation entirely. Useful for A/B comparisons from the same served process.

Why the stack beats either adapter alone

The SFT adapter alone scores P(AI) = 0.50 — barely better than the source. The DPO adapter on its own is meaningless: it's trained as a delta from SFT, not an absolute LoRA. Without SFT active, you're applying a small perturbation to the base model that doesn't recover the SFT distribution.

The stack, by construction, reproduces the training-time forward pass exactly:

StackedLoRALinear · trainingpython

# during training (frozen SFT + trainable DPO):<br>y = base(x) + sft_scale * (x @ sft_a @ sft_b)<br>+ dpo_scale * ((dropout(x) @ lora_a) @ lora_b)

# at serve time (both mounted, "a+b" syntax):<br>y = base(x) + sft_scale * (x @ sft_a @ sft_b)<br>+ dpo_scale * (x @ lora_a @ lora_b)

Same math, same weights, same distribution. That's why the inference numbers match what the training trajectory promised.

Try it

Everything (base, both adapters, model card, held-out eval) is bundled into a single Hugging Face repo: mlx-community/humanizer-1B-OptIQ-4bit (~1.1 GB). Download once, serve with both adapters stacked:

terminal · use the published artifactbash

$ pip install 'mlx-optiq>=0.1.4'

$ huggingface-cli download mlx-community/humanizer-1B-OptIQ-4bit \<br>--local-dir ./humanizer-1B-OptIQ-4bit

$ optiq serve \<br>--model ./humanizer-1B-OptIQ-4bit \<br>--adapter ./humanizer-1B-OptIQ-4bit/adapters/humanizer-sft \<br>--adapter...

humanizer optiq adapter adapters base human

Related Articles