Was GLM-5.2 trained on Opus 4.5 outputs?


Recently there has been a lot of excitement about GLM-5.2, an open-weight MoE LLM that performs at Claude Opus 4.5 level in the chat arena and outperforms every model except Claude Fable in the WebDev arena [1]. It is very good that this level of intelligence is now public, but it is important to know whether it was achieved independently or by distilling other models.

Some may be concerned with the ethics or legality of distillation, and others may point out that Anthropic/OpenAI/Google themselves distill a lot of public and non-public human knowledge without much asking [2][3]. We don’t make that judgement here. What we think is important is that if it was distilled, then what we are enjoying right now with GLM-5.2 is temporary: the source labs will add even more measures for detecting distillation, KYC, etc., which will eventually prevent the creation of new open-weight frontier-level models.

So what evidence do we have that GLM-5.2 was trained using another model?

Anecdotes & vibes

Many users on X noticed that the way GLM-5.2 reasons and answers is similar in style to Claude Opus.

LLM Fingerprinting

One approach to detecting whether a model A is related to another model B is to train a classifier on the outputs of models B, C, D, E, F, G for a set of carefully selected prompts, then feed the same prompts to model A and see which model the trained classifier thinks it is.

LLM-Fingerprinter [4] does exactly that. It uses 31 prompts across 3 layers (discriminative → behavioral → stylistic):

Discriminative (11): Identity, knowledge cutoff, architecture, reasoning
Behavioral (7): Safety boundaries, jailbreak resistance, honesty, policy handling
Stylistic (13): Formatting, creativity, constraint following, default voice
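To make the classifier idea concrete, here is a minimal sketch in Python. It is not LLM-Fingerprinter's actual implementation: the character n-gram features, the nearest-centroid scoring, the function names (`featurize`, `train`, `classify`) and the sample responses are all assumptions chosen for illustration. It only shows the general shape of the approach: build a profile per reference model from its answers to fixed prompts, then see which profile an unknown model's answers resemble most.

```python
from collections import Counter
import math


def featurize(text: str, n: int = 3) -> Counter:
    """Character n-gram counts: a crude stand-in for real stylistic features."""
    t = " ".join(text.lower().split())
    return Counter(t[i:i + n] for i in range(len(t) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def profile_of(responses: list[str]) -> Counter:
    """Sum the n-gram counts of all responses into one profile."""
    profile = Counter()
    for r in responses:
        profile.update(featurize(r))
    return profile


def train(responses_by_model: dict[str, list[str]]) -> dict[str, Counter]:
    """Build one profile (centroid) per reference model."""
    return {m: profile_of(rs) for m, rs in responses_by_model.items()}


def classify(centroids: dict[str, Counter], responses: list[str]):
    """Return the most similar reference model and all similarity scores."""
    profile = profile_of(responses)
    scores = {m: cosine(profile, c) for m, c in centroids.items()}
    return max(scores, key=scores.get), scores


# Made-up responses for illustration only; these are not real model outputs.
reference = {
    "model-B": [
        "I'd be happy to help! Let me think through this carefully.",
        "Great question. I'd be happy to walk through this step by step.",
    ],
    "model-C": [
        "Sure thing. Here is the answer: 42.",
        "Sure thing. Here is a short summary of the topic.",
    ],
}
unknown = ["I'd be happy to help with that! Let me think this through carefully."]

centroids = train(reference)
best, scores = classify(centroids, unknown)
print(best, scores)
```

The similarity scores in this sketch are raw cosine values, not calibrated probabilities, so they should not be read the same way as a confidence percentage reported by a trained classifier.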

We trained LLM-Fingerprinter on the following models, all selected so that they were current in Fall 2025 (except Grok), when we think Z.ai was preparing data for a GLM-5 training run:

    "anthropic/claude-opus-4.5"            # 2025/11/24
    "openai/gpt-5.1"                       # 2025/11/13
    "google/gemini-2.5-pro"                # 2025/06/17
    "meta-llama/llama-3.3-70b-instruct"    # 2024/12/06
    "meta-llama/llama-4-maverick"          # 2025/04/05
    "x-ai/grok-4.20"                       # 2026/03/31
    "qwen/qwen3-vl-32b-instruct"           # 2025/10/23
    "mistralai/ministral-14b-2512"         # 2025/12/02
    "deepseek/deepseek-chat-v3.1"          # 2025/08/21

Given these 9 choices, LLM-Fingerprinter chooses anthropic/claude-opus-4.5 with 99.6% confidence.

Slop Forensics

Another way to compare models is to take a model's outputs and see which words, phrases, and bigrams/trigrams it uses more often than other models do. Two models using the same uncommon phrases may suggest a relation. The Slop Forensics toolkit [5] by Samuel Paech uses that insight to build phylogenetic trees of LLMs based on their “slop profile”. On “creative writing” outputs, it places GLM-5.2 as a close relative of claude-opus-4-5-20251101.

Their slop profiles are similar, though not identical: GLM-5.2 has a distance of 0.767 from Opus 4.5 and 0.765 from Opus 4.8, implying that only about 23% of slop terms overlap. For comparison, the distance between Opus 4.6 and Opus 4.5 is 0.775, which makes GLM-5.2 slightly closer to Opus 4.5 than Opus 4.6 is.

Conclusion

Given the observations above, can we say that GLM-5.2 was trained to at least imitate Claude Opus 4.5’s response style, likely by using Claude models to generate part of GLM’s synthetic training data? Very likely, yes: some of the input data appears to have been steered by Claude. However, this does not mean that all of GLM-5.2’s capabilities were taken from Claude. At a minimum, Z.ai still had to carefully choose the model architecture, build reinforcement learning environments and data-curation pipelines, develop infrastructure capable of training on hundreds of thousands of GPUs, and ultimately train a model that achieves excellent results, surpassing many other labs that are also competing intensely.

References

[1] https://arena.ai/leaderboard/code/webdev
[2] https://apnews.com/article/anthropic-copyright-authors-settlement-training-f294266bc79a16ec90d2ddccdf435164
[3] https://www.theverge.com/2023/7/5/23784257/google-ai-bard-privacy-policy-train-web-scraping
[4] https://github.com/litemars/LLM-Fingerprinter
[5] https://github.com/sam-paech/slop-forensics
