Structure of Every LLM Chat
Structure of Every LLM Chat
Arpit Bhayani<br>engineering, databases, and systems. always building.
If you have only ever interacted with a language model through a chat interface, you have seen one layer of abstraction that hides a lot of engineering. Behind the friendly chat window, every interaction with a modern LLM is structured as a list of messages, each tagged with a role.
That role tagging is not cosmetic. It shapes how the model responds, how context is managed across multiple turns, and how application developers constrain and direct model behaviour at a structural level. Understanding this format is the difference between using an LLM and building reliably on top of one.
Why Roles Exist at All
Base language models - the kind trained purely on next-token prediction over raw text - do not have a natural concept of “conversation.” They continue text. If you feed a base model the string “What is the capital of France?”, it might continue with “What is the capital of Germany? What is the capital of Spain?” because that pattern appears frequently in quiz and FAQ content. The model is doing exactly what it was trained to do: predict plausible continuations.
Instruction-following models (the kind you interact with in production APIs) are fine-tuned on data formatted as conversations. During this fine-tuning, the model sees thousands of examples where a system context is followed by a user request and then a high-quality assistant response. The model learns to treat these structural cues as meaningful. It learns that text following a system prefix should be treated as persistent instructions, that text following a user prefix is a request to respond to, and that it is generating the text that follows the assistant prefix.
The three-role format is therefore not arbitrary. It emerged from how instruction tuning works, and every production-grade model from OpenAI, Google, Anthropic, and Meta has been trained to respect it.
The System Prompt
The system prompt is the foundational instruction layer of a conversation. It is written by the application developer, not the end user, and it executes before any user interaction takes place.
A well-crafted system prompt does several things:
Defines the model’s persona and role (“You are a senior data analyst…”).
Specifies output format constraints (“Always respond in valid JSON with the schema: …”).
Establishes scope boundaries (“Only answer questions about our product documentation. Politely decline off-topic requests.”).
Sets behavioural rules (“Never speculate. If you are uncertain, say so explicitly.”).
Injects background context the model needs (“The current date is… The user’s subscription tier is…”).
The system prompt is processed before the first user message and its content persists through the entire conversation in the model’s context window. It is the most reliable lever you have for controlling model behaviour consistently across all turns.
One critical insight: the system prompt does not have magic authority in the way a configuration file has authority over software. The model has learned to attend to system content heavily because of how it was trained, but it is ultimately still performing token prediction.
A sufficiently adversarial user prompt can sometimes cause the model to deviate from system instructions - this is the class of vulnerabilities known as prompt injection. Never trust that a system prompt alone is a security boundary. Validate and sanitize outputs programmatically when the stakes are high.
Here is a minimal but structurally sound system prompt for a customer support application:
You are a support assistant for Acme Corp. Your job is to help customers with questions about their orders and account settings.
Rules:<br>- Only discuss topics related to Acme Corp products and services.<br>- If you cannot answer with certainty, say "I am not sure - let me connect you with a human agent."<br>- Never disclose internal pricing strategies or supplier information.<br>- Always address the customer by their first name if provided.<br>- Respond concisely. Aim for 2-4 sentences unless the customer asks for detail.<br>Notice that it defines role, scope, fallback behaviour, confidentiality constraints, and style. These four categories cover most of what a useful system prompt needs to specify.
The User Turn
The user turn is the input from the person or the system acting as a person. In a simple chatbot, this is what the human typed. In a programmatic pipeline, this is often constructed by application code - injecting a retrieved document, formatted data, or a templated instruction.
A common mistake is treating the user turn as a place to put everything. Developers sometimes cram persona, instructions, data, and the actual question into a single user message because they are not using the system prompt at all.
This works, to a point, but it conflates different layers of intent. The model is somewhat sensitive to where...