GitHub - aranajhonny/omnipod: Chat with Podcast Transcripts. · GitHub
core
data/transcripts
lex_podcast/data
.gitignore
README.md
app.py
download_transcripts.py
ingest.py
requirements.txt
OmniPod
Chat with 936 podcast episodes. Every answer cites its source.
Ask "What did Karpathy say about neural networks?" and get an answer with the exact transcript chunk it came from. No hallucinations. No guessing.
Why OmniPod?
Most RAG chatbots hallucinate. You ask about a podcast, they invent quotes.
OmniPod doesn't. Every response is grounded: it is verified against the actual transcript before it reaches you. If the source doesn't support the answer, it says so.
Three query types, one pipeline:
| Type | Example | Strategy |
| --- | --- | --- |
| Factual | "What did Huberman say about sleep?" | Retrieve → Generate → Verify |
| Synthetic | "Compare AI safety views across guests" | Map-Reduce → Deduplicate → Synthesize |
| Generative | "Write an essay on consciousness from these episodes" | Plan → Draft → Ground |
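The routing in the table can be sketched roughly as follows. This is an illustration only, not the repository's code: the keyword cues, the `HANDLERS` table, and `route()` are assumptions, and the real `classify_intent()` may work quite differently (for example by asking the LLM).

```python
from typing import Callable

# Hypothetical keyword cues; the real classifier may use an LLM instead.
GENERATIVE_CUES = ("write", "essay", "draft", "compose")
SYNTHETIC_CUES = ("compare", "across", "contrast")


def classify_intent(query: str) -> str:
    """Return 'generative', 'synthetic', or 'factual' for a user query."""
    q = query.lower()
    if any(cue in q for cue in GENERATIVE_CUES):
        return "generative"
    if any(cue in q for cue in SYNTHETIC_CUES):
        return "synthetic"
    return "factual"  # default: plain retrieve-and-answer


# Placeholder handlers that only name the strategy listed in the table.
HANDLERS: dict[str, Callable[[str], str]] = {
    "factual": lambda q: "Retrieve -> Generate -> Verify",
    "synthetic": lambda q: "Map-Reduce -> Deduplicate -> Synthesize",
    "generative": lambda q: "Plan -> Draft -> Ground",
}


def route(query: str) -> str:
    """Dispatch a query to the handler chosen by classify_intent()."""
    return HANDLERS[classify_intent(query)](query)
```

Keeping the classifier separate from the handlers means a stronger classifier could be swapped in without touching the three strategies.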
How it works
You ask a question
┌─────────────┐
│   Router    │  classify_intent() → routes to the right handler
│  LRU cache  │  avoids re-embedding repeated queries
│  Semaphore  │  caps concurrent LLM calls at 5
└──────┬──────┘
┌─────────────┐
│  Retrieval  │  bge-small-en-v1.5 (384d) → Qdrant cosine
│   19,140    │  chunks from 936 Lex Fridman episodes
│   chunks    │  Guest filtering via known-guests index
└──────┬──────┘
┌─────────────┐
│ Generate +  │  DeepSeek V4 Flash via OpenCode API
│   Verify    │  verify_groundedness() → rejects ungrounded answers
└──────┬──────┘
Cited answer in Chainlit UI (localhost:8000)
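The two guards in the router stage (an LRU cache for query embeddings and a semaphore capping LLM calls at 5) might look something like the sketch below. The function names, the cache size, and the fake embedder and LLM call are all assumptions made for illustration; only the cap of 5 comes from the diagram.

```python
import asyncio
from functools import lru_cache

MAX_CONCURRENT_LLM_CALLS = 5  # the cap stated in the diagram


@lru_cache(maxsize=1024)  # cache size is an assumption
def embed_query(text: str) -> tuple[float, ...]:
    """Stand-in embedder; a real one would run bge-small-en-v1.5."""
    return tuple(float(ord(c)) for c in text[:8])


async def call_llm(prompt: str, sem: asyncio.Semaphore, stats: dict) -> str:
    """Placeholder LLM call that records the peak number of concurrent calls."""
    async with sem:  # waits here once 5 calls are already in flight
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # simulate network latency
        stats["active"] -= 1
        return f"answer to: {prompt}"


async def run_batch(prompts: list[str]) -> tuple[list[str], int]:
    """Run all prompts concurrently and report the observed peak concurrency."""
    sem = asyncio.Semaphore(MAX_CONCURRENT_LLM_CALLS)
    stats = {"active": 0, "peak": 0}
    answers = await asyncio.gather(*(call_llm(p, sem, stats) for p in prompts))
    return list(answers), stats["peak"]
```

Calling `asyncio.run(run_batch(prompts))` with 20 prompts should never show more than 5 calls in flight, and a repeated `embed_query()` call with the same text is served from the cache.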
60-second setup
git clone https://github.com/aranajhonny/omnipod && cd omnipod
python3.13 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
echo "OPENCODE_API_KEY=sk-your-key" > .env
docker run -d --name qdrant -p 6333:6333 qdrant/qdrant
python ingest.py --rebuild
chainlit run app.py
# → http://localhost:8000
Numbers that matter
| Metric | Value |
| --- | --- |
| Episodes indexed | 936 Lex Fridman |
| Chunks | 19,140 (512 chars, 128 overlap) |
| Embedding dim | 384 (bge-small-en-v1.5, MPS GPU) |
| Query embedding | ~100ms |
| Vector search | ~50ms (cosine, 19K points) |
| Full answer | ~2s on M1 Pro |
| Full ingest | ~8 min |
| Codebase | 1,138 lines Python, 9 files |
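To make the "512 chars, 128 overlap" figure concrete, here is a minimal sliding-window chunker. It is a sketch under the assumption that chunks are plain character windows; `chunk_text` is a hypothetical name, and the actual ingest script may split on sentence or speaker boundaries instead.

```python
def chunk_text(text: str, size: int = 512, overlap: int = 128) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # 384 new characters per chunk with the defaults
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With these defaults each chunk repeats the last 128 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.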
Transcript scraper included
No YouTube API key needed. Two sources:
lexfridman.com: scrapes official transcript pages (requests + BeautifulSoup)
YouTube: uses the free proxy at youtubetranscript.pro for auto-captions
cd lex_podcast
pip install requests beautifulsoup4
python run.py pipeline  # scrapes all 936 episodes
Output lands in data/transcripts/.
Example queries
"What did Andrej Karpathy say about neural networks?"
"Compare views on AI safety across all guests"
"Write a short essay on human consciousness based on these episodes"
"Summarize what Andrew Huberman says about sleep"
Architecture decisions
Why bge-small-en-v1.5? 384-dim embeddings are fast to search and good enough for conversational podcast text. Runs locally on MPS GPU.
Why Qdrant over Chroma? Cosine search at 19K points in ~50ms. Filterable by guest metadata out of the box.
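As a rough picture of what a guest-filtered cosine search does, the sketch below implements it in plain Python over an in-memory list. It stands in for the work Qdrant performs server-side and is not the Qdrant client API; the `Chunk` class and `search()` function are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    text: str
    guest: str
    vector: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_vec: list[float],
    chunks: list[Chunk],
    guest: Optional[str] = None,
    top_k: int = 3,
) -> list[tuple[float, Chunk]]:
    """Return the top_k chunks by cosine score, optionally limited to one guest."""
    # Filter on metadata first, then score only the remaining candidates.
    candidates = [c for c in chunks if guest is None or c.guest == guest]
    scored = [(cosine(query_vec, c.vector), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

A brute-force scan like this is linear in the number of chunks; a vector database adds an index and applies the metadata filter during the search itself.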
Why intent routing? Factual, synthetic, and generative queries need fundamentally different retrieval and generation strategies. A one-prompt-fits-all approach fails at scale.
Why groundedness verification? LLMs default to confident BS. verify_groundedness() forces the model to check its answer against the retrieved context before showing it to the user.
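One way such a check could be wired up is sketched below. Only the name `verify_groundedness()` comes from this README; the signature, the prompt wording, the YES/NO protocol, the `guarded_answer()` helper, and the fallback message are assumptions. The LLM is passed in as a plain callable so the sketch runs without any API key.

```python
from typing import Callable

# Hypothetical prompt; the real wording in the repository is not shown here.
PROMPT_TEMPLATE = (
    "Context:\n{context}\n\n"
    "Answer:\n{answer}\n\n"
    "Is every claim in the answer supported by the context? Reply YES or NO."
)

FALLBACK = "The retrieved transcripts do not support an answer to this question."


def verify_groundedness(
    answer: str, context: str, ask_llm: Callable[[str], str]
) -> bool:
    """Ask the model whether the answer is supported by the retrieved context."""
    verdict = ask_llm(PROMPT_TEMPLATE.format(context=context, answer=answer))
    return verdict.strip().upper().startswith("YES")


def guarded_answer(answer: str, context: str, ask_llm: Callable[[str], str]) -> str:
    """Return the answer only if it passes the check, else a refusal message."""
    return answer if verify_groundedness(answer, context, ask_llm) else FALLBACK
```

Because the verdict comes from a second model call, this roughly doubles LLM usage per answer, which is one reason a concurrency cap matters.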
License
MIT