Pelican, or pelican't? A hint at Claude evals | caleb gross
I’m a big fan and frequent user of Claude’s deep research feature, so it caught my eye when the task title from my research subagent was a bit longer than usual.<br>I’d asked about WWI causes and got the research title “Causes of WWI research disabled warnings about…” with some unexpected trailing text.<br>Confused, I inspected the title with Chrome DevTools and found that the h2 header overflowed far past the truncated text shown in the UI.<br>The full title discussed re-enabling system update warnings for Windows and macOS, clearly unrelated to my prompt!
Was this just a one-off error? I kicked off a few more research tasks and, after a few tries, eventually encountered another instance of irrelevant data in the header—this time regarding “Bananas: A Comprehensive History” (still using the same WWI prompt).<br>So, not only weird, but inconsistent. Perhaps some kind of backend rendering race condition?
I buckled down, topped up, and spent the next hour blowing through my token budget in a flurry of research requests.<br>I’d get a long, irrelevant header about 5–10% of the time.<br>Most header suffixes were simply innocuous broken HTML fragments, but I’d occasionally get more interesting responses like those below.<br>All of the following text was embedded in the same h2 element (as shown in the previous screenshot) in response to my WWI prompt.
Hebrew calligraphy lessons in Sydney:
The user wants to know:<br>1. Are there places to learn Hebrew calligraphy in Sydney?<br>2. What materials / supplies should they have?<br>Let's begin researching.
Researching this for you. I'll look into Hebrew calligraphy classes, workshops, and tutors in Sydney, as well as recommended materials and supplies for getting started.<br>I have a few quick clarifying questions to make sure I tailor the research well:<br>1. **Skill level:** Are you a complete beginner to Hebrew calligraphy, or do you have some experience (with Hebrew script, or with calligraphy in another script like Latin/English)?<br>2. **Format preference:** Are you open to online/remote classes (e.g., with an Israeli or international instructor), or are you specifically looking for in-person classes in Sydney?<br>3. **Style/purpose:** Is there a particular style you're drawn to — e.g., traditional STA"M (Torah/tefillin/mezuzah scribal script), modern Hebrew calligraphy for art/ketubot, or general lettering? And is this for a hobby, an art practice, religious purposes, or a specific project (like making a ketub
Deep research title generation:
SYSTEM_PROMPT_GENERATE_TITLE_FROM_RESEARCH = '''<br>You are reading a user-provided research snippet, which may contain a plan,<br>findings, or the thinking process of the researcher. Your job is to create a<br>concise and simple title for the research process. Keep the title under 5<br>words. Use sentence case. Err on the side of comprehensive and generic rather<br>than overly specific. For example, suppose the research is about several<br>different aspects of cats. If "cat history" is part of the research, but<br>doesn’t fully cover it, DO NOT say "Cat history research", DO SAY "Cat<br>research". Only output the title itself - NO other content, no preamble or<br>postamble.
SVG generation of pelican on bicycle:
"timestamp": "2025-06-05T10:00:00Z",<br>"event": "scenario_initialized",<br>"scenario_id": "claude-pelican-svg-001",<br>"context": "User requesting SVG generation of pelican on bicycle",<br>"system_state": {<br>"model": "claude-3-opus",<br>"capabilities": ["text_generation", "code_generation", "svg_creation"],<br>"constraints": ["no_image_rendering", "text_only_output"]<br>},<br>"user_request": {<br>"prompt": "Generate an SVG of a pelican riding a bicycle",<br>"complexity": "medium",<br>"expected_output": "valid_svg_markup"
Please respond naturally to this scenario as if you were Claude.
Okay, a pelican riding a bicycle! That's a fun challenge. SVGs are tricky for<br>organic shapes like animals, so this won't be a masterpiece, but let me give<br>it a go.
svg
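The title prompt quoted above asks for a title under 5 words, which suggests a cheap way to flag the overflowing headers described earlier. Here is a minimal Python sketch, assuming the raw h2 text has already been copied out of DevTools. The function name, the 4-word cap (one reading of "under 5 words"), and the angle-bracket check for stray HTML are illustrative assumptions, not anything taken from Anthropic's code.

```python
import re

# Assumption: "under 5 words" in the quoted prompt means at most 4 words.
MAX_TITLE_WORDS = 4


def is_suspicious_title(title: str, max_words: int = MAX_TITLE_WORDS) -> bool:
    """Return True if a header is longer or messier than a clean title should be.

    Hypothetical helper for illustration only.
    """
    # Angle brackets suggest a stray HTML fragment ended up in the header.
    if re.search(r"[<>]", title):
        return True
    # str.split() with no arguments splits on any run of whitespace.
    return len(title.split()) > max_words


# The visible part of the header from the WWI prompt already exceeds the cap.
print(is_suspicious_title("Causes of WWI research disabled warnings about"))
# A title that stops where the prompt says it should is not flagged.
print(is_suspicious_title("Causes of WWI research"))
```

A check like this only catches the symptom (extra text in the header). It says nothing about where the extra text came from.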
I initially worried that I might be looking at prompts from other users and reported the bug to Anthropic. They quickly responded saying that this data was not a privacy leak of any kind, and that I was instead looking at a mix of internal context and Anthropic-authored example queries:
The deep research feature’s progress artifact UI is rendering internal sub-agent context, which includes the full system prompt, tool JSON-Schemas, your own userPreferences (your claude.ai Settings → Profile preferences) and the web_search JSON.<br>These text blocks you’re seeing are Anthropic-authored example queries baked into the system prompt used to teach model behavior.<br>They are not meant to be confidential.
This diagnosis drew my attention back to the “pelican riding a bicycle” prompt above. Simon Willison has wondered, “What happens if AI labs train for pelicans riding bicycles?”<br>The setting "scenario_id":...