Ably Realtime | HTTP streaming and AI
HTTP streaming and AI
Direct HTTP streaming is fine for one-off interactions and breaks down everywhere else. These are the four limitations that show up once an AI app is in production.
Most AI frameworks support a simple client-driven interaction: the client makes an HTTP request, an agent handles it, and the response streams back to the client over Server-Sent Events (SSE) or a similar HTTP stream. The pattern is widely supported and surprisingly effective for one-shot interactions. Its simplicity is also the source of its limitations.
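As a rough illustration of this pattern, the sketch below frames a stream of tokens as Server-Sent Events, the way a minimal handler might write them to the response body. The `fake_llm_tokens` generator and the function names are hypothetical stand-ins, not part of any framework or of Ably's API. A real handler would also serve the response with `Content-Type: text/event-stream`.

```python
from typing import Iterable, Iterator


def fake_llm_tokens() -> Iterator[str]:
    """Hypothetical stand-in for a model's token stream."""
    yield from ["Hello", ",", " world"]


def format_sse(data: str) -> str:
    """Frame one payload as an SSE event.

    Each line of the payload gets its own `data:` field, and a blank
    line terminates the event.
    """
    lines = data.split("\n")
    return "".join(f"data: {line}\n" for line in lines) + "\n"


def stream_response(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one SSE frame per token, in the order tokens are produced."""
    for token in tokens:
        yield format_sse(token)


if __name__ == "__main__":
    # Write the frames as a handler would write them to the response body.
    for frame in stream_response(fake_llm_tokens()):
        print(frame, end="")
```

Nothing in this handler remembers what it has sent: once a frame is written to the connection, it is gone. That is what keeps the pattern simple, and it is also why the stream cannot survive the connection that carries it.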
The limitations below arise from coupling the client-to-agent interaction to the transport that carries it. The connection, the request, and the streamed response all share the same lifetime: they exist for one interaction, between one client and one agent. Anything that requires the interaction to outlive the connection, or to be visible to anything other than that one client, requires building new infrastructure on top.
Streams fail on disconnection
The operation of a response stream is tied to the health of the underlying connection. When the connection drops, the response stream fails.
This happens routinely. A phone switches from Wi-Fi to cellular. A user refreshes the page. A laptop lid closes mid-response. The LLM continues to generate tokens, and there is nowhere to deliver them.
SSE is the default streaming transport for most AI frameworks. The SSE protocol does include a mechanism for a reconnecting client to specify a position in the stream to resume from. In practice it is rarely supported, because supporting it adds significant backend complexity. To resume an SSE stream you assign sequence numbers to token events for ordering, buffer those events in an external store, and add a new HTTP endpoint to handle resume requests. That is a substantial departure from a stateless request handler. Even with the work done, resume only covers reconnection of an existing client; it does not cover continuity after a page refresh, because SSE has no built-in concept of session identity. Building that is yet another layer on top.
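The three pieces of resume described above (sequence numbers, a buffer, and a resume handler) can be sketched as follows. This is a minimal in-memory sketch under stated assumptions: `ResumableStream` and its methods are hypothetical names, a Python list stands in for the external store, and the resume endpoint is reduced to a method that takes the value of the `Last-Event-ID` header that an SSE client sends when it reconnects.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class ResumableStream:
    """Hypothetical buffer for one response stream.

    A list stands in for the external store a real deployment would need.
    """

    events: List[str] = field(default_factory=list)

    def append(self, token: str) -> int:
        """Buffer a token and return its 1-based sequence number."""
        self.events.append(token)
        return len(self.events)

    def replay(self, last_event_id: Optional[str] = None) -> Iterator[str]:
        """Yield SSE frames for every event after `last_event_id`.

        `last_event_id` is the raw Last-Event-ID header value; None or an
        empty string replays from the start. Assumes tokens contain no
        newlines, so one `data:` line per event is enough.
        """
        start = int(last_event_id) if last_event_id else 0
        for seq, token in enumerate(self.events[start:], start=start + 1):
            yield f"id: {seq}\ndata: {token}\n\n"


if __name__ == "__main__":
    stream = ResumableStream()
    for token in ["The", " answer", " is", " 42"]:
        stream.append(token)

    # The client saw events 1 and 2 before dropping, then reconnects.
    for frame in stream.replay("2"):
        print(frame, end="")
```

Even in this reduced form, the handler is no longer stateless: something has to own the buffer, and something has to map a reconnecting client back to the right `ResumableStream`. That mapping is the session identity that SSE does not provide.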
Sessions do not span devices
With HTTP streaming, the connection is exclusive to the requesting client and the...