"Optimal Cognitive Core"- specialized 1.7B model for grounded question answering

occ-ai/OCC-RAG-1.7B · Hugging Face

","pad_token":"","unk_token":null},"chat_template_jinja":"{%- for message in messages -%}\n {%- if message['role'] == 'system' -%}\n {{ 'system\\n' + message['content'] + '\\n' }}\n {%- elif message['role'] == 'user' -%}\n {%- if documents and loop.last -%}\n {{ 'user\\n' + message['content'] + '\\n' }}\n {%- for doc in documents -%}\n {{ '' + (loop.index | string) + ' ' + doc['text'] + '\\n' }}\n {%- endfor -%}\n {{ '\\n' }}\n {%- else -%}\n {{ 'user\\n' + message['content'] + '\\n' }}\n {%- endif -%}\n {%- elif message['role'] == 'assistant' -%}\n {{ 'assistant\\n\\n\\n\\n\\n' + message['content'] + '\\n' }}\n {%- endif -%}\n{%- endfor -%}\n{%- if add_generation_prompt -%}\n {{ 'assistant\\n\\n\\n\\n\\n\\n' }}\n{%- endif -%}\n"},"createdAt":"2026-05-29T07:11:33.000Z","discussionsDisabled":false,"discussionsSorting":"recently-created","downloads":38,"downloadsAllTime":38,"id":"occ-ai/OCC-RAG-1.7B","isLikedByUser":false,"availableInferenceProviders":[],"showHuggingChatEntry":false,"inference":"","lastModified":"2026-06-03T12:53:59.000Z","likes":0,"pipeline_tag":"text-generation","library_name":"transformers","librariesOther":[],"trackDownloads":true,"model-index":null,"private":false,"repoType":"model","gated":false,"tags":["transformers","safetensors","qwen3","text-generation","rag","faithful-qa","occ","conversational","en","ru","arxiv:2606.00683","base_model:Qwen/Qwen3-1.7B-Base","base_model:finetune:Qwen/Qwen3-1.7B-Base","license:mit","text-generation-inference","endpoints_compatible","region:us"],"tag_objs":[{"id":"text-generation","label":"Text Generation","type":"pipeline_tag","subType":"nlp"},{"id":"transformers","label":"Transformers","type":"library"},{"id":"safetensors","label":"Safetensors","type":"library"},{"id":"en","label":"English","type":"language"},{"id":"ru","label":"Russian","type":"language"},{"id":"qwen3","label":"qwen3","type":"other","clickable":true},{"id":"rag","label":"rag","type":"other","clickable":true},{"id":"faithful-qa","label":"faithful-qa","type":"other","clickable":true},{"id":"occ","label":"occ","type":"other","clickable":true},{"id":"conversational","label":"conversational","type":"other","clickable":true},{"id":"base_model:Qwen/Qwen3-1.7B-Base","label":"base_model:Qwen/Qwen3-1.7B-Base","type":"other","clickable":true},{"id":"base_model:finetune:Qwen/Qwen3-1.7B-Base","label":"base_model:finetune:Qwen/Qwen3-1.7B-Base","type":"other","clickable":true},{"id":"text-generation-inference","label":"text-generation-inference","type":"other","clickable":true},{"id":"endpoints_compatible","label":"Inference Endpoints","type":"other","clickable":true},{"id":"arxiv:2606.00683","label":"arxiv:2606.00683","type":"arxiv","extra":{"paperTitle":"OCC-RAG: Optimal Cognitive Core for Faithful Question Answering"}},{"id":"license:mit","label":"mit","type":"license"},{"type":"region","label":"🇺🇸 Region: US","id":"region:us"}],"transformersInfo":{"auto_model":"AutoModelForCausalLM","pipeline_tag":"text-generation","processor":"AutoTokenizer"},"widgetData":[{"text":"Hi, what can you help me with?"},{"text":"What is 84 * 3 / 2?"},{"text":"Tell me an interesting fact about the universe!"},{"text":"Explain quantum computing in simple terms."}],"safetensors":{"parameters":{"BF16":1720574976},"total":1720574976,"sharded":false,"totalFileSize":3441185608},"hasBlockedOids":false,"region":"us","isQuantized":false},"discussionsStats":{"closed":0,"open":0,"total":0},"query":{},"inferenceContextData":{"billableEntities":[],"entityName2Providers":{}},"hasQuantizations":true,"copyToBucketNamespaces":[]}">

OCC-RAG-1.7B

GitHub |<br>Technical Report |<br>Cloud

OCC-RAG-1.7B is a 1.7B-parameter small language model specialized for faithful, context-grounded question answering . Along with OCC-RAG-0.6B, it belongs to the first generation of Optimal Cognitive Core (OCC) specialized reasoning models. Given a question and a set of sources, it produces a structured reasoning trace with explicit source citations, decides whether the context actually supports an answer, and either answers from the context or abstains.

Despite its size, OCC-RAG-1.7B matches or exceeds general-purpose models 2–6× larger on multi-hop reasoning, faithfulness, and refusal benchmarks, and attains the best faithfulness across all evaluated scales (up to 32B). It is mid-trained from Qwen/Qwen3-1.7B-Base on a large synthetic corpus of multi-context, multi-hop QA with citation-anchored reasoning traces.

Highlights

Faithful by design — answers only from the supplied context; achieves the best faithfulness (lowest memorization ratio) across all evaluated scales, including 32B models.

Calibrated abstention — outputs Not enough information when the context does not support an answer.

Structured, citable reasoning — every answer comes with a transparent trace (query analysis → source analysis → reasoning → status → answer) that cites sources by id.

Compact — a small model that delivers...

"Optimal Cognitive Core"- specialized 1.7B model for grounded question answering

Related Articles

The Newest Instagram "Exploit" Is the Goofiest I've Seen

It's Not Just X. It's Y

Amazon, Facebook, FBI have access to a private intelligence-sharing network

Show HN: GoPeek – open links in live mini browser windows without new tabs

Agent Memory: An Anatomy