Phlox — A full-featured, self-hostable AI platform
GitHub
Self-hostable · Open source · Apache 2.0
A full-featured AI platform<br>you actually own.
Phlox is a self-hostable AI platform — an agentic tool-using harness, document RAG,<br>code execution, an OpenAI-compatible gateway, and per-user cost accounting — running<br>over any model provider: AWS Bedrock or any OpenAI-compatible<br>endpoint, including fully local models.
View on GitHub
Quick start →
Runs over
AWS Bedrock
OpenAI
Ollama
OpenRouter
vLLM
LiteLLM
LM Studio
localhost:5173
~40built-in agent tools
100%self-hostable & offline-capable
8built-in themes
6+model providers, one config
Bring your own model. Or run it all locally.
Named provider profiles cover AWS Bedrock and any<br>OpenAI-compatible endpoint — OpenAI, LiteLLM, or a local runtime. Point Phlox at<br>Ollama, LM Studio, or vLLM and the whole stack — chat and RAG embeddings —<br>runs offline with no cloud API key. Switch profiles live, with a built-in connection<br>tester.
Define as many provider profiles as you like, switch between them instantly
Local embeddings (e.g. nomic-embed-text) keep RAG fully offline
Edit profiles, pricing, and limits live — no server restart required
config.yml<br>Copy
default_profile: local-ollama<br>profiles:<br>local-ollama:<br>type: openai<br>label: "Ollama (local)"<br>endpoint: http://localhost:11434/v1<br>api_key: ollama # ignored by Ollama<br>model: qwen3.6:35b<br>supports_tools: true
Everything in one app<br>A complete assistant, not just a chat box
Phlox bundles the pieces you'd otherwise stitch together yourself — each one self-hosted and under your control.
💬
Streaming chat
Conversation history with rename, delete, search & export. Edit/regenerate messages, markdown with highlighted, copyable code, plus Mermaid diagrams and LaTeX math.
🤖
Agentic harness
The model uses tools in a loop — filesystem, shell, Python/Node execution, document search, plus planning, sub-agents, memory, and checkpoints — all in a sandboxed workspace.
🤝
Human-in-the-loop
Pause on sensitive tools, approve or deny, then resume. The run state is persisted, so approvals survive disconnects.
🧰
Code execution & artifacts
Run code with captured output and inline artifacts. A Workspace Files panel lets you browse and download everything the agent created.
📚
Documents & RAG
Upload PDF, DOCX, TXT, MD, or code. Hybrid dense + sparse search over Qdrant with reranking and citations, scoped globally or per conversation. Works offline.
🌐
Opt-in web search
A per-prompt composer toggle exposes web_search (zero-config ddgs or SearXNG) so the agent can discover current sources before fetching pages.
🧠
Cross-conversation memory
Durable facts are saved and semantically recalled across chats, so the assistant remembers you from one conversation to the next.
🖼️
Multimodal
Attach images to messages for vision models, persisted and replayed into the provider as image content parts.
🔌
MCP integration
Connect Model Context Protocol servers from the UI; their tools join the model's toolset automatically, no code required.
🚪
OpenAI-compatible gateway
Mint per-user API keys and call Phlox from any OpenAI SDK via /v1/chat/completions — with the same per-user cost accounting.
💵
Usage & cost accounting
Per-message token and cost in the UI, plus an admin chargeback view by month × user × department × model, with CSV export for finance.
🎨
Theming
Phlox Dark by default, with Light, Fred Hutch, Hutch Night, Sandstone and more — instant switching via a CSS-variable token system.
The agentic core<br>A real agent, not "chat that calls tools"
Each turn, the model works in a loop — calling tools, planning, and recovering — inside a per-conversation sandboxed workspace you can inspect, snapshot, and roll back.
01 Tool loop
Filesystem (read_file, write_file, edit_file, glob, grep), run_shell, execute_python / execute_node, and search_documents — one unified tool surface the model drives until the task is done.
02 Planning & sub-agents
update_todos keeps a visible plan; spawn_subagent runs a nested, ephemeral agent with a scoped toolset in the same workspace and returns a report.
03 Memory & checkpoints
save_memory persists durable facts across chats. Every workspace is a git repo that auto-snapshots after mutating tools, with one-click restore.
04 Approvals & permissions
Every tool has an auto / ask / deny policy. The loop pauses on ask, persists its state, and resumes statelessly after you decide.
User message
Agent loopprovider.stream → tool_calls
read / write / edit
run_shell
execute_python
search_documents
web_search
spawn_subagent
Permission gate auto · ask · deny
Final answer + artifacts
Knowledge & memory<br>Your documents, searched the right way
Upload PDFs, Office docs, markdown, or source code. Phlox parses, chunks, and<br>embeds them into Qdrant , then retrieves with true hybrid search —<br>a dense semantic vector and a sparse lexical vector per chunk, fused with RRF...