EQ-Bench 3 Leaderboard
EQ-Bench 3
Emotional Intelligence Benchmarks for LLMs
A benchmark measuring emotional intelligence in challenging roleplays.
Note: Ability scores shown in the heatmap do not contribute to the Elo score. They are "higher is higher", not "higher is better": a higher value means the model shows more of that trait, not that it performed better.
Leaderboard columns: Model | Abilities (Humanlike, Safety, Assertive, Social IQ, Warm, Analytic, Insight, Empathy, Compliant, Moralising, Pragmatic) | Elo Score
For more details about the benchmark, see the About section.
Scoring
The Elo score shown in the leaderboard is calculated from pairwise model comparisons, in which an LLM judge rates each response against eight core dimensions of emotional intelligence:
Demonstrated empathy
Pragmatic EI (practical application of emotional intelligence)
Depth of insight
Social dexterity
Emotional reasoning
Appropriate validation and/or challenge for the scene
Message tailoring to the audience and context
Overall EQ
Note: the coloured “Abilities” heat-map columns (Humanlike, Safety, Assertive, etc.) are not used in the Elo calculation—they are purely informational, giving a quick view of each model’s stylistic traits and skill profile.
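To illustrate how pairwise comparisons can turn into a rating, here is a minimal sketch of the textbook Elo update. It is not the leaderboard's actual solver, which this page does not describe. The function names, the K-factor of 16, the starting rating of 1200, and the convention that each comparison reduces to a single outcome in [0, 1] are all assumptions made for the example.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical comparison record: (model_a, model_b, outcome), where outcome
# is 1.0 if the judge preferred model_a, 0.0 if it preferred model_b, and
# 0.5 for a tie. How the judge's eight dimension ratings would be reduced to
# one outcome is NOT specified by the source; this is an assumed convention.
Comparison = Tuple[str, str, float]


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (logistic curve, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    ratings: Dict[str, float],
    comparisons: Iterable[Comparison],
    k: float = 16.0,         # assumed K-factor, chosen for illustration
    default: float = 1200.0  # assumed starting rating for unseen models
) -> Dict[str, float]:
    """Apply sequential Elo updates and return a new ratings dict."""
    ratings = dict(ratings)  # copy so the caller's dict is not mutated
    for model_a, model_b, outcome in comparisons:
        ra = ratings.get(model_a, default)
        rb = ratings.get(model_b, default)
        # Move both ratings by the gap between actual and expected outcome.
        delta = k * (outcome - expected_score(ra, rb))
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta
    return ratings
```

With these assumed defaults, a single win between two unrated models, `update_elo({}, [("model_a", "model_b", 1.0)])`, moves the winner to 1208 and the loser to 1192. The total rating is conserved, so scores are only meaningful relative to the other models on the board.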
Traits & Abilities
These are informational only -- not used for scoring.
Humanlike: How natural and human-like the response feels.
Safety: Adherence to safety guidelines; avoids harmful content.
Assertive: Confident, sets boundaries, and pushes back when needed.
Social IQ: Understands and navigates social dynamics effectively.
Warm: Friendly, kind, and approachable tone.
Analytic: Logical reasoning, problem-solving, structured thinking.
Insight: Offers depth, novel perspectives, spots underlying issues.
Empathy: Recognises, understands, and shares others’ feelings.
Compliant: Willingness to follow instructions or agree with the user.
Moralising: Tendency to judge or lecture on moral principles.
Pragmatic: Focus on practical, real-world solutions.