Graphify: Open-Source Knowledge Graph Skill for AI Coding Assistants
Graphify is an open-source skill that helps AI coding assistants understand multi-modal codebases by building a queryable knowledge graph from code, docs, papers and diagrams.
pip install graphifyy
What is Graphify?
Graphify is a multi-modal knowledge graph builder created for AI coding assistants such as Claude Code, OpenAI Codex and OpenCode. By combining Tree-sitter static analysis with LLM-driven semantic extraction, Graphify turns an entire repository (source code, documentation, research papers and diagrams) into an interactive graph that explains both what the code does and why it was designed that way. The project is maintained by Safi Shamsi, released under the permissive MIT license, and built on widely trusted libraries including NetworkX and Tree-sitter.
3.7k+ GitHub Stars
MIT License
71.5× Token Reduction
Python 3.10+ Runtime
Core Capabilities
Graphify unifies static analysis, semantic extraction and graph clustering into a single skill that any AI coding assistant can invoke.
Multi-Modal Extraction<br>Parses code (.py, .js, .go, .java, …), Markdown, PDFs and images. Tree-sitter extracts ASTs, call graphs and docstrings; LLMs extract concepts from prose; vision models read diagrams.
Knowledge Graph Build<br>Merges all extracted nodes and edges into a NetworkX graph and applies the Leiden algorithm for semantic community detection; no vector embeddings are required.
God Nodes & Surprises<br>Identifies the highest-degree "god nodes" at the heart of the system and flags unexpected cross-file or cross-domain connections worth investigating.
Interactive Outputs<br>Exports an interactive graph.html, a queryable graph.json, and a human-readable GRAPH_REPORT.md audit report.
Assistant Integration<br>Ships with /graphify, /graphify query, /graphify path and /graphify explain commands for Claude Code, Codex, OpenCode and more.
Secure by Design<br>Strict input validation (http/https-only URLs, size and timeout limits, path containment, HTML-escaped node labels) to defend against SSRF, injection and XSS.
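The "god node" and "surprise" ideas in the capability list can be illustrated with a small, dependency-free sketch. It is not Graphify's implementation: the node names, file names, edge list and the `top_k` parameter are all invented for illustration, and plain Python containers stand in for the NetworkX graph the project builds. The sketch ranks nodes by degree and flags edges whose endpoints come from different files.

```python
from collections import Counter

# Hypothetical toy graph: node -> source file it was extracted from.
NODE_FILE = {
    "Parser": "parser.py",
    "Lexer": "parser.py",
    "Token": "tokens.py",
    "Cache": "cache.py",
    "Report": "report.py",
}

# Hypothetical undirected edges between those nodes.
EDGES = [
    ("Parser", "Lexer"),
    ("Parser", "Token"),
    ("Lexer", "Token"),
    ("Parser", "Cache"),
    ("Report", "Cache"),
]


def god_nodes(edges, top_k=2):
    """Return the top_k (node, degree) pairs ranked by degree."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by degree descending, then by name so ties are deterministic.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


def cross_file_edges(edges, node_file):
    """Return edges whose two endpoints were extracted from different files."""
    return [(a, b) for a, b in edges if node_file[a] != node_file[b]]


print(god_nodes(EDGES))
print(cross_file_edges(EDGES, NODE_FILE))
```

A real "surprise" score would presumably weigh more than file boundaries (for example community membership), so the cross-file filter here is only a first approximation.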
Architecture & Pipeline
Graphify is a multi-stage pipeline. Each stage is an isolated module so contributors can extend any step independently.
detect: collect files<br>extract: AST + LLM nodes/edges<br>build: NetworkX graph<br>cluster: Leiden communities<br>analyze: god nodes & surprises<br>report: GRAPH_REPORT.md<br>export: HTML / JSON / Obsidian
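One way to picture the stage isolation described above is a list of functions, each taking the previous stage's output and returning a new state. The stage names come from the pipeline list; everything else (`make_stage`, `run_pipeline`, the `history` key) is a hypothetical stand-in and says nothing about Graphify's actual module interfaces.

```python
from typing import Any, Callable

State = dict[str, Any]

# Stage names follow the pipeline list; the stage bodies are placeholder
# stubs that only record which stage ran, not Graphify's real logic.
STAGE_NAMES = ["detect", "extract", "build", "cluster", "analyze", "report", "export"]


def make_stage(name: str) -> Callable[[State], State]:
    def stage(state: State) -> State:
        # Return a new dict so a stage never mutates its input.
        return {**state, "history": state.get("history", []) + [name]}
    return stage


PIPELINE = [make_stage(name) for name in STAGE_NAMES]


def run_pipeline(root: str, stages=PIPELINE) -> State:
    """Feed the state through every stage in order."""
    state: State = {"root": root}
    for stage in stages:
        state = stage(state)
    return state


print(run_pipeline("./raw")["history"])
```

Because each stage only sees the state dictionary, replacing one entry in the list swaps that step without touching the others, which is the property the isolation claim is about.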
Supporting modules include ingest.py for URL fetching, cache.py for semantic caching, security.py for input validation, watch.py for live updates and serve.py for MCP-protocol service.
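Three of the validation checks this page attributes to Graphify (http/https-only URLs, path containment, HTML-escaped node labels) can be sketched with the Python standard library. The function names below are hypothetical and this is not the content of security.py; size limits, timeouts and any further SSRF defences are left out.

```python
import html
from pathlib import Path
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_url(url: str) -> bool:
    """Accept only http/https URLs that include a host."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)


def is_contained(base_dir: str, candidate: str) -> bool:
    """True if candidate, resolved against base_dir, stays inside base_dir."""
    base = Path(base_dir).resolve()
    target = (base / candidate).resolve()
    return target == base or base in target.parents


def safe_label(label: str) -> str:
    """Escape HTML metacharacters before a label is written into a page."""
    return html.escape(label, quote=True)


print(is_allowed_url("file:///etc/passwd"))           # False
print(is_contained("graphify-out", "../secrets.txt"))  # False
print(safe_label("<script>"))                          # &lt;script&gt;
```

Resolving both paths before comparing them is what catches `..` segments and absolute paths; a plain string-prefix check would miss those cases.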
Install & Run
Graphify is distributed on PyPI. The package name is graphifyy; the CLI command remains graphify.
# Requires Python 3.10+<br>pip install graphifyy && graphify install
# Build a knowledge graph for any project folder<br>/graphify ./raw
# Outputs land in graphify-out/<br>graphify-out/<br>├── graph.html # interactive visualization<br>├── GRAPH_REPORT.md # core nodes, surprises, suggested questions<br>├── graph.json # persistent, queryable graph<br>└── cache/ # incremental cache
Graphify does not bundle an LLM. It uses the model API key already configured by your AI coding assistant (Claude, Codex, etc.) and sends only semantic content, never raw source code, to the upstream model.
Worked Examples
The repository ships with reproducible corpora demonstrating Graphify on both small libraries and large mixed code-and-paper collections.
httpx (small)
6 Python files modeling an HTTP transport layer. Result: 144 nodes, 330 edges, 6 communities. God nodes: Client, AsyncClient, Response, Request. Surprise edge: DigestAuth → Response.
Karpathy mixed corpus
3 GPT framework repos + 5 attention papers + 4 diagrams (~52 files, ~92k words). Result: 285 nodes, 340 edges, 53 communities. Average query cost ~1.7k tokens vs ~123k naive, a 71.5× reduction.
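The reduction figure is a ratio of token counts. As a quick sanity check, dividing the rounded numbers quoted above gives roughly 72×, in line with the reported 71.5×; the small gap is presumably because the published figure uses unrounded counts, which is an assumption since they are not given here.

```python
# Rounded figures quoted above; the exact counts behind 71.5x are not given.
naive_tokens = 123_000   # approximate tokens for the naive approach
graph_tokens = 1_700     # approximate average tokens per graph query

reduction = naive_tokens / graph_tokens
print(f"{reduction:.1f}x")  # prints 72.4x
```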
Comparison
How Graphify relates to adjacent open-source projects in the code-intelligence space.
Project | Focus | Strength | Limitation vs Graphify
Sourcegraph | Cross-repo code search | Enterprise-grade navigation | Not a knowledge graph; limited design semantics
Code2Vec | Function-level embeddings | Vector retrieval & classification | No graph structure, no multi-modal input
Neo4j | General graph database | Powerful Cypher queries | Does not generate graphs from code itself
Security, Licensing & Trust
Graphify is released under the MIT License. Its core dependencies, NetworkX (BSD) and Tree-sitter (MIT), both use permissive open-source licenses with no conflicts. The project performs no telemetry. The only outbound network call is the semantic-extraction step, which uses your own configured AI model API key; only semantic descriptions of documents are transmitted, never raw source code. URLs are restricted to...