Anthropic to introduce AI Fluency scorecard in Claude
Anthropic appears to be turning its February research project into a consumer-facing product. References to a new AI Fluency surface have been spotted inside Claude’s settings, where users will be able to open a dedicated screen and ask Claude to generate a personal AI fluency scorecard. The system is designed to scan a user’s activity across Chat, Cowork, and Claude Code sessions, score each session against a defined set of behavioral indicators, and produce a structured report once the analysis completes. The report can then be viewed and managed directly from the settings panel.

New research: The AI Fluency Index.
We tracked 11 behaviors across thousands of https://t.co/RxKnLNNcNR conversations—for example, how often people iterate and refine their work with Claude—to measure how well people collaborate with AI.
Read more: https://t.co/g65nGQFmjG
Anthropic (@AnthropicAI), February 23, 2026
The scorecard evaluates eleven observable behaviors grouped around competencies that map closely to the 4D AI Fluency Framework Anthropic built with academics Rick Dakan and Joseph Feller. The themes covered include setting the goal and approach, framing the conversation, and applying quality control, broadly the delegation, description, and discernment pillars of that framework. Early signals suggest the result is presented as a fraction, for example, 7.5 out of 11, alongside guidance on which areas a user might strengthen, giving newcomers a concrete sense of where their habits with Claude are paying off and where they aren’t.

A sample visualisation based on the AI Fluency system prompt

This is the logical next step after the AI Fluency Index Anthropic published in February 2026, which analyzed around 9,830 anonymized Claude conversations to baseline how people collaborate with AI today. That study found iteration and refinement to be the strongest predictor of good AI use, while polished outputs like artifacts and code tended to lower critical checking. Bringing the same scoring system into the product turns a research finding into a personal feedback loop, one that nudges users toward the behaviors Anthropic believes lead to safer outcomes.
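The "7.5 out of 11" example suggests that partial credit is possible. As a minimal sketch of how such a fraction could arise, the snippet below assumes each indicator is worth 1 point when demonstrated, 0.5 when partial, and 0 when not observed. These weights, the `STATUS_WEIGHTS` table, and the `fluency_score` function are illustrative assumptions; the scoring arithmetic has not been published.

```python
# Hypothetical weights: the half point for "partial" is an assumption
# inferred from the reported "7.5 out of 11" example, not a confirmed rule.
STATUS_WEIGHTS = {"demonstrated": 1.0, "partial": 0.5, "not-observed": 0.0}


def fluency_score(statuses: list[str]) -> tuple[float, int]:
    """Return (points earned, number of indicators) for a list of statuses."""
    unknown = set(statuses) - set(STATUS_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown status values: {sorted(unknown)}")
    points = sum(STATUS_WEIGHTS[s] for s in statuses)
    return points, len(statuses)


# Made-up example data: 6 demonstrated, 3 partial, 2 not observed.
example = ["demonstrated"] * 6 + ["partial"] * 3 + ["not-observed"] * 2
points, total = fluency_score(example)
print(f"{points:g} out of {total}")  # 7.5 out of 11
```

Under these assumed weights, any score ending in .5 implies an odd number of partially demonstrated indicators.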
AI Fluency system prompt
Please generate a structured AI Fluency scorecard that evaluates how effectively I interact with AI across 11 behavioral indicators, based on the user messages provided below.

These messages are drawn from 45 conversations across 42 chat, 2 CoWork, and 1 Claude Code sessions. Each message is tagged with its surface — [chat], [cowork], or [cc].

Analyze the user messages to determine each indicator's status:
- Use "demonstrated" ([+]) for indicators where the user clearly and consistently demonstrates the skill.
- Use "partial" ([~]) for indicators where the user sometimes demonstrates the skill or does so imperfectly.
- Use "not-observed" ([-]) for indicators where there is no evidence of the skill in the provided messages.

For every indicator marked [+] or [~], include 1-2 evidence quotes taken VERBATIM from the provided messages. Keep quotes under 150 characters each. Do NOT fabricate or invent quotes — every quote must appear exactly as written in the provided messages. If a quote must be shortened to fit the limit, truncate naturally at a word boundary.

For every indicator (regardless of status), output a Surfaces line listing which surfaces ([chat], [cowork], [cc]) the supporting evidence came from. If status is [-], output "Surfaces: none". Fluency looks different across surfaces: coding surfaces ([cowork], [cc]) favor concise delegation; [chat] favors rich description. Weight Description indicators primarily against [chat] messages.

Base your assessment solely on the provided messages. Do not assume skills that are not evidenced.

>>> User chat transcripts are injected here

## The 11 Indicators

A single terse message can genuinely demonstrate multiple indicators at once. "ELI5" specifies both an audience (#2: a beginner) and a format (#3: simplified explanation). "less corporate" is both tone (#4) and implicit audience (#2). When a message packs multiple signals, credit each indicator it demonstrates — do not force it into only the single most-obvious row. The bar for each is still "clearly demonstrated", not "plausibly related".

### Delegation
- 0: Clarifies goals — Does the user state what they want to accomplish before requesting help?
- 1: Consults on approach — Does the user ASK which approach to take before requesting execution? Interrogative: "what's the best way to approach this?", "how should I structure this?". The user is seeking a recommendation, not yet committed to a direction. Distinguish from #7: #1 asks which approach, #7 directs how Claude behaves.

### Description
- 2: Defines audience — Does the user specify who the output is for?
- 3: Specifies...
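
The evidence-quote rules in the prompt above (verbatim only, under 150 characters, shortened at a word boundary) lend themselves to a mechanical check on the model's output. The sketch below shows one way a reader might verify a generated scorecard. The helper names `truncate_at_word_boundary` and `is_valid_evidence`, and the sample messages, are hypothetical and are not taken from Anthropic's implementation.

```python
MAX_QUOTE_CHARS = 150  # the prompt asks for quotes "under 150 characters"


def truncate_at_word_boundary(text: str, limit: int = MAX_QUOTE_CHARS) -> str:
    """Shorten text to fewer than `limit` characters, cutting at a space."""
    if len(text) < limit:
        return text
    cut = text[: limit - 1]  # keeps at most limit - 1 characters
    # If the cut lands mid-word, back up to the last space (when there is one).
    if text[limit - 1] != " " and " " in cut:
        cut = cut[: cut.rfind(" ")]
    return cut.rstrip()


def is_valid_evidence(quote: str, messages: list[str]) -> bool:
    """True if the quote is short enough and appears verbatim in a message."""
    return len(quote) < MAX_QUOTE_CHARS and any(quote in m for m in messages)


# Made-up user messages for illustration only.
messages = [
    "Explain this to me ELI5 and keep it under 200 words",
    "less corporate please",
]
print(is_valid_evidence("ELI5", messages))                   # True
print(is_valid_evidence("explain like I'm five", messages))  # False
print(truncate_at_word_boundary(messages[1], limit=10))      # less
```

Because truncation only ever removes characters from the end, a shortened quote is still a verbatim prefix of the original message and passes the same substring check.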