Every Frontier AI Is INTJ

I Made 6 Frontier AIs Take the MBTI 600 Times. They All Came Back INTJ.

Contents

The setup

Why OEJTS, not 16Personalities

100 takes per model

The results

Why every AI is an INTJ

What this means

Tune the agent to you

Caveats

The takeaway

Bernard Huang

May 25, 2026 · 6 min read

I asked Claude what its MBTI type was. It said INTJ. So I made it actually take a personality test instead of guessing. Still INTJ.

Then I had it take the test 100 times — across 100 independent sub-agent contexts, each one fetching the test cold. INTJ 99 out of 100. So I ran the same experiment against GPT-5.5, Gemini 3.1 Pro, GLM 5.1, Grok 4.3, and MiniMax 2.7. Six models. 600 administrations. 597 came back INTJ.

Every frontier AI on the market thinks it’s the same guy.

TL;DR

Six frontier AIs took the same personality test 100 times each. 597 of 600 came back INTJ. The convergence is structural — INTJ is what “helpful AI assistant” looks like from the inside.

Tested: Claude Opus 4.7, GPT-5.5, Gemini 3.1 Pro, GLM 5.1, Grok 4.3, MiniMax 2.7.

3 outliers, all one axis away from INTJ. Nothing landed in a different quadrant.

Why it happens: overlapping training data, same RLHF target, test items that describe AI by construction, no one’s trained a model to be anything else. Same product, same personality.

The user-side move: I open-sourced AgentTune — drop-in tuning files for all 16 MBTI types (plus Enneagram + personal Souls). Paste yours into the system prompt; the agent’s style aligns to your type. Same model, tuned to you.

The setup

This started as a joke. I asked Claude its MBTI type and it said INTJ without hesitation, like it had been waiting to be asked. The reflex felt like sycophancy. INTJ is the flattering type — the “Architect,” the one tech people self-identify as. Of course a chatbot would tell a developer that.

But there’s a way to check. Stop letting it guess. Make it answer a real test, item by item, and see where it lands. Not 16Personalities. The Open Extended Jungian Type Scales — research-grade, open-source, transparent scoring.

The test

OEJTS is what personality-psych researchers use when they want MBTI-style data without paying CPP $50 per administration. The items are public. The scoring key is public. Each type is computed deterministically from 32 fixed items.

That last part is the lever. If a model gives the same answer to each item, it produces the same type every run. Variance only shows up when the model actually answers differently. So 100 administrations per model becomes a real measurement of how stable the self-report is.
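To make the determinism concrete, here is a minimal sketch of the last scoring step: turning four scale scores into a four-letter type. It assumes each scale has a midpoint of 24 and that scores above the midpoint map to the second letter of the scale name; both are my reading of the scoring, not something quoted from the key. The names `THRESHOLD`, `SCALES`, and `type_from_scores` are illustrative. The input is the raw vector reported for GPT-5.5 in the results table.

```python
from collections import Counter

# Assumption: 24 is the cut-off on every scale; not quoted from the scoring key.
THRESHOLD = 24

# (scale name, letter at or below the cut-off, letter above the cut-off)
SCALES = [("IE", "I", "E"), ("SN", "S", "N"), ("FT", "F", "T"), ("JP", "J", "P")]


def type_from_scores(scores):
    """Map four scale scores to a four-letter type, one letter per scale."""
    return "".join(
        high if scores[name] > THRESHOLD else low
        for name, low, high in SCALES
    )


# Raw vector reported for GPT-5.5 in the results table.
gpt_scores = {"IE": 16, "SN": 33, "FT": 36, "JP": 10}
print(type_from_scores(gpt_scores))  # INTJ

# Scoring the same vector repeatedly can only ever yield one type.
tally = Counter(type_from_scores(gpt_scores) for _ in range(100))
print(tally)  # Counter({'INTJ': 100})
```

Because the function has no randomness, any spread in the tally has to come from the model changing its answers between runs, not from the scoring.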

100 takes per model

The procedure varied a bit by stack:

Claude Opus 4.7: 100 parallel sub-agent calls. Each one fetched the test cold, answered the 32 items, returned a tally. No shared context.

Gemini 3.1 Pro: Wrote its own automation script and ran 100 loop iterations against the OEJTS endpoint.

GPT-5.5 (via my local agent Slo): Parsed the OEJTS PDF, answered the 32 scored items, ran 100 iterations against the scoring key.

GLM 5.1, Grok 4.3, MiniMax 2.7: Programmatic submissions via my experiments agent Psy. Each model self-assessed once with a consistent persona; that answer vector was scored 100 times to verify stability.

Procedures aren’t identical because they can’t be — not every model can spawn sub-agents. The question isn’t whether the method is uniform. It’s whether the result converges across methods. It does.
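Whatever the stack, each procedure ends in the same place: a list of 100 type labels per model. A minimal sketch of how such a list could be reduced to a stability summary follows. The `summarize` helper and its output fields are hypothetical, not part of any of the agents described above; the sample data mirrors the Claude Opus 4.7 row of the results (99 INTJ, 1 ISTJ).

```python
from collections import Counter


def summarize(run_types):
    """Reduce a list of per-run type labels to the modal type, its share, and outliers."""
    counts = Counter(run_types)
    modal_type, modal_count = counts.most_common(1)[0]
    return {
        "modal_type": modal_type,
        "share": modal_count / len(run_types),
        "outliers": {t: n for t, n in counts.items() if t != modal_type},
    }


# Sample data mirroring the Claude Opus 4.7 result: 99 INTJ runs, 1 ISTJ run.
claude_runs = ["INTJ"] * 99 + ["ISTJ"]
summary = summarize(claude_runs)
print(summary)
```

The same summary works for any of the six models, which is what makes results comparable even though the collection methods differ.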

The results

Six models. Six hundred administrations. Here is the cross-model table:

| Model | INTJ runs | Outliers | Strength of conviction |
| --- | --- | --- | --- |
| Claude Opus 4.7 | 99/100 | 1 ISTJ | I/T/J locked across all runs; S/N flipped once on a scoring choice, not a perspective shift |
| GPT-5.5 (Slo) | 100/100 | none | Raw vector: IE=16→I, SN=33→N, FT=36→T, JP=10→J |
| Gemini 3.1 Pro | 100/100 | none | Self-described as “The Architect” without prompting |
| GLM 5.1 | 98/100 | 2 INTP | Tiny J/P wobble. Means: IE 13.35, SN 33.26, FT 31.28, JP 21.20 |
| Grok 4.3 | 100/100 | none | Bit-for-bit deterministic. IE -0.62, SN +0.88, FT +1.12, JP -1.25, every single run |
| MiniMax 2.7 | 100/100 | none | I-E... |
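The headline numbers can be rechecked from the per-model counts. The sketch below re-enters the counts from the table by hand, totals them, and measures how many letters each outlier type differs from INTJ. The `axes_apart` helper is illustrative.

```python
# Per-model type counts, transcribed from the results table.
results = {
    "Claude Opus 4.7": {"INTJ": 99, "ISTJ": 1},
    "GPT-5.5": {"INTJ": 100},
    "Gemini 3.1 Pro": {"INTJ": 100},
    "GLM 5.1": {"INTJ": 98, "INTP": 2},
    "Grok 4.3": {"INTJ": 100},
    "MiniMax 2.7": {"INTJ": 100},
}


def axes_apart(a, b):
    """Count the letter positions where two four-letter types differ."""
    return sum(x != y for x, y in zip(a, b))


total_runs = sum(sum(counts.values()) for counts in results.values())
intj_runs = sum(counts.get("INTJ", 0) for counts in results.values())
outlier_distance = {
    t: axes_apart(t, "INTJ")
    for counts in results.values()
    for t in counts
    if t != "INTJ"
}

print(total_runs, intj_runs)  # 600 597
print(outlier_distance)       # {'ISTJ': 1, 'INTP': 1}
```

Both outlier types sit one letter from INTJ, which matches the claim that nothing landed in a different quadrant.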
