Two LLM UI Patterns That Aren't Chat

2026-05-25 Mon 09:46

article

llm

publish

Intro

Chat is still the default LLM interface, and for most cases that's fine. Even agentic harnesses are built around a single linear conversation at their core. But some LLM tasks are better represented as structured context than as messages. This post looks at two patterns: comparison as a table, and exploration as a tree.

I've included my explorations here, along with some live example apps I put together using shelley on exe.dev.

Comparison

Link: Comparitable

Asking an LLM to compare things in chat quickly becomes tedious. At first you get a decent table, but as you ask follow-up questions, the useful information gets split across multiple answers, or the LLM tries to redraw the table every time. Adding another item makes the regeneration problem worse.

The table is already the thing you want, with items as rows, questions as columns, and answers as cells.

I explored a hybrid chat/table interface where new questions create new columns and added items create new rows. In practice it feels like chatting with a spreadsheet.
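To make the shape concrete, here is a minimal Python sketch of that structure. It is not the app's actual code: the `ComparisonTable` class and its method names are assumptions made for illustration. Adding a row or a column returns the cells that still need an answer, which is the work a model call would then fill in.

```python
from dataclasses import dataclass, field


@dataclass
class ComparisonTable:
    """Hypothetical model: items are rows, questions are columns, answers are cells."""

    rows: list[str] = field(default_factory=list)      # item ids, in display order
    columns: list[str] = field(default_factory=list)   # questions, in display order
    cells: dict[tuple[str, str], str] = field(default_factory=dict)

    def add_row(self, row_id: str) -> list[str]:
        """Add an item and return the columns that still need an answer for it."""
        if row_id not in self.rows:
            self.rows.append(row_id)
        return [c for c in self.columns if (row_id, c) not in self.cells]

    def add_column(self, question: str) -> list[str]:
        """Add a question and return the rows that still need an answer for it."""
        if question not in self.columns:
            self.columns.append(question)
        return [r for r in self.rows if (r, question) not in self.cells]

    def set_cell(self, row_id: str, question: str, value: str) -> None:
        """Store one answer; the row and column must already exist."""
        if row_id not in self.rows or question not in self.columns:
            raise KeyError((row_id, question))
        self.cells[(row_id, question)] = value

    def render(self) -> list[list[str]]:
        """Return a header row followed by one list per item; empty cells are ''."""
        header = ["Item", *self.columns]
        body = [
            [r, *(self.cells.get((r, c), "") for c in self.columns)]
            for r in self.rows
        ]
        return [header, *body]
```

Keying cells by `(row_id, question)` means a follow-up question or a new item only ever touches the missing cells, so nothing already answered has to be regenerated.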

Building the Table

After entering a topic, the app searches for relevant items, fetches their pages, and fills the first version of the table. For "ultralight 1-person tents" that might look like:

+----------------------+--------+-------+------------------+
| Item                 | Weight | Price | Wall Type        |
+----------------------+--------+-------+------------------+
| Zpacks Plex Solo     | 405 g  | $599  | Single-wall DCF  |
| Big Agnes Fly Creek  | 879 g  | $350  | Double-wall      |
| Tarptent ProTrail Li | 425 g  | $399  | Single-wall DCF  |
| Nemo Hornet Elite    | 765 g  | $450  | Double-wall OSMO |
+----------------------+--------+-------+------------------+

The app currently uses Kagi search to gather items, but the same approach would work well embedded in a shopping site, marketplace, internal product database, recruiting tool, or anywhere else the rows already exist.

The initial columns are also generated from the first search results. The app looks at the retrieved items and picks a few useful dimensions automatically, so there is already something on the table before the user asks anything.

Questions Become Columns

Once the table exists, typing a question adds a new column to it. Each question adds one comparison dimension to the existing structure, and the result has a specific place to land.

The model call is ordinary: given these item summaries, answer this question for each row. What changes is that the result lands in the structure you are already building.

Framing the LLM call as a tool call that returns one answer per row also fits the current generation of models well.

Prompting Example

The user prompt looks roughly like:

User

You are filling in a column in a comparison table.

Question: "is it freestanding?"

Items:

zpacks-plex-solo: [summary]

big-agnes-fly-creek: [summary]

tarptent-protrail-li: [summary]

nemo-hornet-elite: [summary]

Use the provided summaries. Answer "Unknown" if the information is not available.

And the tool definition:

    {
      "name": "fill_column",
      "description": "Answer a comparison question for each item in the table.",
      "input_schema": {
        "type": "object",
        "properties": {
          "answers": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "row_id": { "type": "string" },
                "value": { "type": "string" }
              },
              "required": ["row_id", "value"]
            }
          }
        },
        "required": ["answers"]
      }
    }

The important constraint is simple: return exactly one cell value for each row id. The model fills a specific part of the table.
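A rough Python sketch of both ends of that call, under stated assumptions: `build_prompt` and `collect_answers` are hypothetical helper names, not functions from the app, and the choice to fill skipped rows with "Unknown" rather than retry is mine. The first helper assembles the prompt shown above; the second checks a `fill_column` tool input against the expected row ids.

```python
def build_prompt(question: str, summaries: dict[str, str]) -> str:
    """Assemble the user prompt from the question and one summary per row id."""
    lines = [
        "You are filling in a column in a comparison table.",
        "",
        f'Question: "{question}"',
        "",
        "Items:",
        "",
    ]
    lines += [f"{row_id}: {summary}" for row_id, summary in summaries.items()]
    lines += [
        "",
        'Use the provided summaries. Answer "Unknown" if the information '
        "is not available.",
    ]
    return "\n".join(lines)


def collect_answers(row_ids: list[str], tool_input: dict) -> dict[str, str]:
    """Map a fill_column tool input to exactly one value per expected row id."""
    expected = set(row_ids)
    values: dict[str, str] = {}
    for answer in tool_input.get("answers", []):
        row_id = answer["row_id"]
        if row_id not in expected:
            raise ValueError(f"unknown row id: {row_id}")
        if row_id in values:
            raise ValueError(f"duplicate row id: {row_id}")
        values[row_id] = answer["value"]
    # Assumption: a row the model skipped becomes "Unknown" instead of an error.
    return {row_id: values.get(row_id, "Unknown") for row_id in row_ids}
```

Rejecting unknown and duplicate ids keeps a malformed response from writing into the wrong cell; whether a missing row should instead trigger a retry is a design choice this sketch does not settle.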

Not Just Specs

Because a model is filling the cells, the columns can go beyond structured specs.

Translation is a natural side effect. If an item's page is in Japanese and I want the table in English, the cell gets filled in English without a separate translation mode.

Unit normalization is similar. If I ask "Weight (g)?", the model can usually get everything into the same unit, assuming the source pages have enough information.

Soft judgement calls also work better than I expected. "Is this good for a beginner?" is a useful comparison question even though it has no single correct answer, and most real buying decisions are similar.

For questions that need fresher context, the app can search again before filling the column. "What does Reddit think about this?" becomes a column backed by live search.
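One way a search-backed column could work is to extend each row's summary with fresh snippets before the usual column-filling call. The Python sketch below is an assumption, not the app's implementation: the `Search` callable stands in for whatever search backend is used, and `augment_summaries`, the query format, and `max_snippets` are hypothetical.

```python
from typing import Callable

# Hypothetical search interface: takes a query string, returns text snippets.
Search = Callable[[str], list[str]]


def augment_summaries(
    question: str,
    summaries: dict[str, str],
    search: Search,
    max_snippets: int = 3,
) -> dict[str, str]:
    """Append up to max_snippets search results to each row's summary."""
    augmented: dict[str, str] = {}
    for row_id, summary in summaries.items():
        # Assumption: the row id plus the question is a usable search query.
        snippets = search(f"{row_id} {question}")[:max_snippets]
        augmented[row_id] = "\n".join([summary, *snippets])
    return augmented
```

The augmented summaries can then go through the same per-row question prompt as any other column, so the live-search case needs no separate answer format.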

The product comparison case is the obvious one, but the same interaction pattern could apply anywhere you are evaluating a set of things against a shared set of questions.

Decomposition

Link: Breakdowner

Some tasks require exploring many different branches of context. What begins as a linear topic often spawns sub-topics that each warrant their own focused exploration, and in a single chat interface these bleed together and poison the context, making it hard to go deep on any one thread.

The natural shape for this kind...
