A Thousand Inertial Samples in One Edge - by Jaimin
Atoms to Algorithms
A Thousand Inertial Samples in One Edge
Thursday, May 28, 2026 · Perception
Jaimin
May 28, 2026
A Unitree G1 humanoid strides across a warehouse aisle at 1.4 m/s. The motion sensor on its wrist (an inertial measurement unit, or IMU) is reporting acceleration and rotation rate a thousand times a second. The stereo cameras on its head are producing fresh images ten times a second. The robot’s brain has to combine those two streams into one continuous answer to “where am I and how am I moving.” Done naively, the math buckles under the rate mismatch: a hundred inertial samples arrive for every camera frame. The whole modern visual-inertial stack survives because of one clever trick from a 2012 paper out of Sydney, refined in 2015 and 2017 by a team across Zurich and Atlanta. The trick collapses those hundred IMU samples into a single edge in the robot’s map. That edge is what we are walking through today.
Last post we closed with the robot’s SLAM graph snapping a three-meter drift back to centimeter consistency through one loop-closure correction. Today is the other half of the same math: how the IMU’s high-rate stream gets compressed so it can sit inside that graph without overwhelming it. Tomorrow we close Week 4 with the larger sensor-fusion mechanism, the math that decides which of the robot’s senses to trust at any given instant.
How it actually works
Think of the robot’s IMU as a passenger in a car with their eyes closed, jotting down “we just turned a bit left, we just sped up, we just hit a bump.” Done at a thousand samples a second, that becomes a usable estimate of how the car moved between two checkpoints, as long as the passenger’s pencil-and-paper math is honest. The catch is that every IMU has small, drifting calibration errors (engineers call these biases), and any error in the bias estimate compounds through the integration: a constant accelerometer bias error turns into a position error that grows with the square of elapsed time. The robot’s brain wants to estimate those biases on the fly. But every time it changes its guess, all of the IMU integration since the last camera frame has to be redone. With a hundred samples between frames, and an optimizer that revises its guess many times per solve, that repeated re-integration is what makes a naive visual-inertial system too slow to run.
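To put a number on how a bias error compounds, here is a minimal one-dimensional sketch in Python with NumPy. It double-integrates a stationary accelerometer whose bias is misjudged by a constant amount. The 0.05 m/s² bias error and the helper name `integrate_position` are made up for illustration; only the 1 kHz rate comes from the story above.

```python
import numpy as np

def integrate_position(accel, dt):
    """Double-integrate 1D acceleration samples (simple Euler), starting at rest."""
    velocity = np.cumsum(accel) * dt
    position = np.cumsum(velocity) * dt
    return position[-1]

rate_hz = 1000          # IMU rate from the article
dt = 1.0 / rate_hz
bias_error = 0.05       # m/s^2, hypothetical error in the bias estimate

drift_m = {}
for seconds in (0.1, 1.0, 10.0):
    n = int(round(seconds * rate_hz))
    true_accel = np.zeros(n)              # the robot is actually standing still
    measured = true_accel + bias_error    # what the mis-calibrated sensor reports
    drift_m[seconds] = (integrate_position(measured, dt)
                        - integrate_position(true_accel, dt))
    print(f"{seconds:5.1f} s window -> {drift_m[seconds]:.6f} m of position drift")
```

The drift follows roughly 0.5 · b · t²: about a quarter of a millimeter over one 0.1 s camera interval, but about 2.5 m after ten seconds. That quadratic growth is why the bias has to be estimated continuously, and why the cost of re-integrating after every revised guess matters.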
The 2012 insight, from Todd Lupton and Salah Sukkarieh, is to do the integration in the body’s own frame at the start of the window, not in the world frame. That subtle change makes the integral a relative measurement (how much did the robot move and rotate between camera frames i and j, as seen from frame i), and it makes the result independent of where the robot was when the window started. The 2015 and 2017 refinement by Christian Forster, Luca Carlone, Frank Dellaert, and Davide Scaramuzza does the same thing properly on the curved surface that 3D rotations live on, and adds one more trick: a precomputed mathematical object (a Jacobian) that lets the robot adjust its bias estimate without redoing the integration at all. When the optimizer wants to test “what if the gyroscope bias were 0.001 radians per second different,” it applies a small correction to the already-computed integral instead of replaying the hundred samples.
The result is that a window of one hundred IMU samples collapses into nine numbers (three each for the change in rotation, velocity, and position) and a small covariance matrix, all bundled into a single edge in the robot’s map between two camera frames. The optimizer treats it like any other measurement. The hundred-to-one compression is what makes modern visual-inertial odometry possible on a robot that is running everything else (vision, planning, control) on the same compute budget.
The 2026 wave is adding learning on top without changing this underlying math. Work published in 2025 and 2026 trains small neural networks to estimate the IMU bias more accurately from a window of recent samples, and then feeds that estimate as a prior into the same preintegration factor. A March 2026 paper called the “Plug-and-Play Learning-based IMU Bias Factor” does exactly this.
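To make the mechanics concrete, here is a minimal sketch of the preintegration-plus-Jacobian idea in Python with NumPy. The window is integrated once in the frame of its first sample, and any later bias update (from the optimizer or from a learned prior) is applied as a first-order correction instead of a replay. The function names, the synthetic IMU data, and the bias offsets are all made up for illustration. Noise covariance propagation and the gravity terms of the full factor are left out, so treat this as a sketch of the idea and not as the papers' implementation.

```python
import numpy as np

def skew(v):
    """3x3 cross-product matrix, so skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def so3_exp(theta):
    """Rotation vector -> rotation matrix (Rodrigues formula)."""
    angle, K = np.linalg.norm(theta), skew(theta)
    if angle < 1e-8:
        return np.eye(3) + K
    return (np.eye(3) + np.sin(angle) / angle * K
            + (1.0 - np.cos(angle)) / angle**2 * K @ K)

def right_jacobian(theta):
    """Right Jacobian of SO(3): Exp(theta + d) ~= Exp(theta) @ Exp(Jr @ d)."""
    angle, K = np.linalg.norm(theta), skew(theta)
    if angle < 1e-8:
        return np.eye(3) - 0.5 * K
    return (np.eye(3) - (1.0 - np.cos(angle)) / angle**2 * K
            + (angle - np.sin(angle)) / angle**3 * K @ K)

def preintegrate(gyro, accel, dt, bg, ba):
    """Integrate one window in the frame of its first sample.

    Returns the nine-number delta (dR, dv, dp) plus the Jacobians of
    that delta with respect to the gyro bias bg and accel bias ba.
    """
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    J_R_bg = np.zeros((3, 3))
    J_v_bg, J_v_ba = np.zeros((3, 3)), np.zeros((3, 3))
    J_p_bg, J_p_ba = np.zeros((3, 3)), np.zeros((3, 3))
    for w, a in zip(gyro, accel):
        a_c = a - ba                      # bias-corrected acceleration
        theta = (w - bg) * dt             # bias-corrected rotation increment
        A = dR @ skew(a_c) @ J_R_bg       # how gyro bias tilts the acceleration
        # Position terms first: they use the velocity terms from the start of the step.
        J_p_ba = J_p_ba + J_v_ba * dt - 0.5 * dR * dt**2
        J_p_bg = J_p_bg + J_v_bg * dt - 0.5 * A * dt**2
        J_v_ba = J_v_ba - dR * dt
        J_v_bg = J_v_bg - A * dt
        dp = dp + dv * dt + 0.5 * (dR @ a_c) * dt**2
        dv = dv + (dR @ a_c) * dt
        # Rotation terms last: everything above needs dR from the start of the step.
        J_R_bg = so3_exp(theta).T @ J_R_bg - right_jacobian(theta) * dt
        dR = dR @ so3_exp(theta)
    return dR, dv, dp, J_R_bg, J_v_bg, J_v_ba, J_p_bg, J_p_ba

def correct(pre, d_bg, d_ba):
    """First-order bias update of a stored delta, without touching the samples."""
    dR, dv, dp, J_R_bg, J_v_bg, J_v_ba, J_p_bg, J_p_ba = pre
    return (dR @ so3_exp(J_R_bg @ d_bg),
            dv + J_v_bg @ d_bg + J_v_ba @ d_ba,
            dp + J_p_bg @ d_bg + J_p_ba @ d_ba)

# Synthetic window: 100 samples at 1 kHz, i.e. one camera interval.
rng = np.random.default_rng(0)
dt, n = 1e-3, 100
gyro = 0.5 * rng.standard_normal((n, 3))                          # rad/s
accel = np.array([0.0, 0.0, 9.81]) + rng.standard_normal((n, 3))  # m/s^2

pre = preintegrate(gyro, accel, dt, np.zeros(3), np.zeros(3))
d_bg = np.array([0.001, -0.002, 0.0005])   # hypothetical bias revision, rad/s
d_ba = np.array([0.01, -0.005, 0.008])     # hypothetical bias revision, m/s^2

fast = correct(pre, d_bg, d_ba)                       # a few small matrix products
slow = preintegrate(gyro, accel, dt, d_bg, d_ba)[:3]  # replay all 100 samples
errors = [float(np.abs(f - s).max()) for f, s in zip(fast, slow)]
print("max |corrected - replayed| for dR, dv, dp:", errors)
```

The corrected delta and the fully replayed one should agree up to second-order terms in the bias change, which is why the optimizer can keep revising its bias guess while touching only the stored delta and its Jacobians. For real systems, a library such as GTSAM provides a tested implementation of this factor.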
A May 2025 paper, “Learned IMU Bias Prediction for Invariant Visual Inertial Odometry,” takes it further by freezing the bias inside a special filter (an invariant Kalman filter) so the math stays consistent under the kind of high-acceleration moves a humanoid makes. And the biggest 2026 result, “Resilient odometry via hierarchical adaptation” out of Carnegie Mellon and published in Science Robotics, validates a stack across two hundred kilometers and eight hundred hours of operation on aerial, wheeled, and legged robots, with the IMU promoted from “helper sensor when vision fails” to a first-class member of the fusion stack.
New this week
There were three substantial 2026 results in this layer. The Science Robotics paper from CMU is the headline: a fielded, multi-robot result that puts the IMU on equal footing with cameras and LiDAR, with a learned inertial branch that carries the trajectory when the exteroceptive sensors fail. The “Plug-and-Play Learning-based IMU Bias Factor” arXiv paper provides a drop-in improvement to existing VIO...