DVD-JEPA — a world model that dreams a bouncing logo
Reality (ground truth)
JEPA's expectation (decoded)
Predictive surprise (reality vs. expectation)
The model's mind — 32-d latent z
mode: monitor
⏸ Pause · ↻ New world · ⚡ Inject anomaly · 💭 Dream 30 steps ahead
Decoder
speed
Tip: turn the Decoder off to see what a pure JEPA actually gives you: just the 32 latent bars. It understands the bounce perfectly and refuses to draw it. Turn it back on to render the dream. Hit Inject anomaly to teleport the logo and watch the surprise meter spike.
01 / predict
Future in latent space
The predictor steps one tick forward as a vector, not a picture. It is trained to match an EMA (exponential moving average) target encoder's embedding of the real next frame, which is the core JEPA objective.
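The two moving parts of that objective can be sketched in a few lines. This is a minimal illustration, not the demo's actual code: the function names, the flat-list representation of weights, the decay values, and the choice of mean squared error as the latent distance are all assumptions made for the example.

```python
DIM = 32  # latent size, matching the 32-d z shown in the demo


def ema_update(target_weights, online_weights, decay=0.99):
    """Nudge each target-encoder weight toward the online encoder's weight.

    The decay value is an illustrative assumption, not taken from the demo.
    """
    return [decay * t + (1.0 - decay) * o
            for t, o in zip(target_weights, online_weights)]


def latent_loss(predicted_z, target_z):
    """Mean squared error between the predicted and the target latent vector.

    MSE is assumed here; the source does not say which distance is used.
    """
    return sum((p - t) ** 2 for p, t in zip(predicted_z, target_z)) / len(predicted_z)


# Usage: the target weights drift toward the online weights over many updates.
online = [1.0] * DIM
target = [0.0] * DIM
for _ in range(100):
    target = ema_update(target, online, decay=0.9)

# A perfect prediction in latent space gives zero loss.
perfect = latent_loss([0.5] * DIM, [0.5] * DIM)
```

The loss is computed between vectors only, so no pixels are involved at any point in training the predictor.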
02 / render
The optional decoder
A pure JEPA has no decoder. Bolt one on and the latent dream becomes pixels, turning the model into a future-frame video predictor you can actually watch.
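As a rough sketch of what "bolting on" a decoder means, the code below rolls a latent forward for 30 steps and renders each step to a tiny frame. Everything here is a stand-in chosen for illustration: `toy_predict` is a hypothetical placeholder for the trained predictor, the decoder is an untrained random linear map, and the 8×12 frame size is arbitrary.

```python
import random

LATENT_DIM = 32
HEIGHT, WIDTH = 8, 12  # hypothetical tiny frame size


def make_decoder(seed=0):
    """Build an untrained linear decoder: one weight row and one bias per pixel."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.1) for _ in range(LATENT_DIM)]
               for _ in range(HEIGHT * WIDTH)]
    bias = [0.0] * (HEIGHT * WIDTH)
    return weights, bias


def decode(z, decoder):
    """Map a latent vector to a HEIGHT x WIDTH frame of pixel intensities."""
    weights, bias = decoder
    flat = [sum(w * zi for w, zi in zip(row, z)) + b
            for row, b in zip(weights, bias)]
    return [flat[r * WIDTH:(r + 1) * WIDTH] for r in range(HEIGHT)]


def toy_predict(z):
    """Hypothetical stand-in for the learned predictor: shrink the latent slightly."""
    return [0.9 * v for v in z]


def dream_frames(z, predict, decoder, steps=30):
    """Roll the predictor forward in latent space, decoding every step."""
    frames = []
    for _ in range(steps):
        z = predict(z)  # the step itself never touches pixels
        frames.append(decode(z, decoder))
    return frames


frames = dream_frames([1.0] * LATENT_DIM, toy_predict, make_decoder())
```

Note that `decode` is only called to produce something viewable. Dropping it leaves the rollout itself unchanged, which is the sense in which the decoder is optional.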
03 / detect
Surprise = anomaly
When reality stops matching the rendered expectation, prediction error spikes. That spike is a usable anomaly signal, and flagging such mismatches is the same job a real egocentric-video world model does.
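One simple way to turn that mismatch into a number is a per-pixel mean squared error with a fixed threshold. The sketch below assumes exactly that; the frame size, the logo size, the `draw_logo` helper, and the threshold value are all hypothetical, and the demo may compute its surprise score differently.

```python
HEIGHT, WIDTH = 8, 12  # hypothetical tiny frame size


def draw_logo(x, y, logo_w=3, logo_h=2):
    """Return a blank frame with a solid logo whose top-left corner is (x, y)."""
    frame = [[0.0] * WIDTH for _ in range(HEIGHT)]
    for row in range(y, y + logo_h):
        for col in range(x, x + logo_w):
            frame[row][col] = 1.0
    return frame


def surprise(real_frame, expected_frame):
    """Mean squared pixel difference between reality and the rendered expectation."""
    total = 0.0
    count = 0
    for real_row, expected_row in zip(real_frame, expected_frame):
        for real_px, expected_px in zip(real_row, expected_row):
            total += (real_px - expected_px) ** 2
            count += 1
    return total / count


def is_anomaly(score, threshold=0.05):
    """Flag a frame when surprise exceeds a threshold (value chosen for this example)."""
    return score > threshold


expected = draw_logo(2, 3)    # where the model expects the logo to be
on_track = draw_logo(2, 3)    # reality agrees with the expectation
teleported = draw_logo(8, 1)  # reality after an injected jump

calm_score = surprise(on_track, expected)
spike_score = surprise(teleported, expected)
```

With these toy frames the matching case scores zero and the teleported case scores well above the threshold, mirroring the spike on the surprise meter after Inject anomaly.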