The Plumbing That Decides a Machine Learning Competition
Alan Scott Encinas
8 min read · Jul, 2026
This is a Builder Journal entry written from inside a live machine-learning competition I am actually competing in. Not a tidy after-the-fact write-up where I already know how it ends. It runs long and it shows code, because it covers the part most write-ups skip: everything that has to be true before a model is even allowed to matter. You get where I was, with the parts that are still my edge kept dark.

In the movie, the Predator does not see the jungle the way you do. It reads heat, flips to another band, and hunts a signature the human eye throws away. I have spent the last few weeks building something with the exact same job, except my target is real and my footage comes from a sensor that records colors your retina was never built to receive.

The competition is the Hyperspectral Object Tracking Challenge 2026, run on Kaggle alongside an academic hyperspectral-imaging conference. The task sounds simple and refuses to be. You get one object, boxed in the first frame of a video, and you have to find that same object in every frame after it. Frame one, here is the target. Frame two through the end, where did it go. Single-object tracking is old and well-studied for normal video. What makes this one different is the camera. A normal frame gives you three numbers per pixel, red, green, blue. A hyperspectral frame gives you sixteen, or twenty-five, each a thin slice of the spectrum, and some of those slices sit past the edge of anything a human can register. A target and its background can look identical to you and be screaming in a band you have never seen.

You are scored two ways at once: how tightly your predicted box overlaps the real one, and how close the center of your box stays to the target as it moves. One number rewards the shape of the guess. The other rewards never losing the thing. The podium sits around 0.56. I came into this stretch with a baseline of 0.524, about three and a half hundredths under the line, close enough to be maddening.

So I expected the hard part to be the tracking. I was wrong. The hard part, this whole stretch, was everything that has to be true before a model is even allowed to matter. Two of those things are worth your time, and here they are with the work shown.

The leaderboard that won’t tell you your score

Picture a final exam where you hand in your answers and the professor never returns a grade. Three guesses for the whole term, one sealed result at the very end. Study as hard as you want. You will not learn whether you were right until it no longer matters.

That is the leaderboard I am climbing. The test set they score you on is unlabeled. You get the first-frame box and nothing else, no answer key, no ground truth. The public leaderboard is the only mirror, and submissions are rationed, so you glance at the speedometer twice a day and drive the rest of the race blind. Develop that way long enough and you are not engineering anymore. You are guessing with extra steps.

There is a way out, and building it was the most important thing I did all stretch. The training data is labeled. Four hundred and five sequences, every frame boxed, ground truth included. So you stop chasing the leaderboard and build your own scorer at home, one that grades against that labeled set as many times as you want, for free.

The work was not writing the scorer. The work was proving it matched. A homemade metric that disagrees with the official one is worse than nothing, because it lies to you with total confidence while you make decisions on top of it. So before I trusted a single number it gave me, I rebuilt the official metric from the ground up and made the two agree exactly.

The shape of it is two measurements per frame, averaged.
Overlap, and how far the center drifted.

```python
def iou(box_a, box_b):
    """Overlap of two [x, y, w, h] boxes, 0 (miss) to 1 (perfect)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Corners of the intersection rectangle.
    ix1, iy1 = max(ax, bx), max(ay, by)
    ix2, iy2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    # Clamp at zero so boxes that do not overlap contribute no area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```
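An overlap function like this is easy to check against boxes whose answers can be worked out on paper, which is one plausible first step toward trusting a homemade scorer. What follows is a minimal sketch, not the competition's scorer. `iou_xywh` restates the same formula only so the block runs on its own, and `mean_overlap` is a hypothetical helper that assumes a sequence's overlap is the plain per-frame mean, which may not be how the official metric aggregates.

```python
import math


def iou_xywh(box_a, box_b):
    """Same overlap formula, restated so this block is self-contained."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def mean_overlap(pred_boxes, gt_boxes):
    """Hypothetical helper: plain per-frame mean IoU over one sequence."""
    if len(pred_boxes) != len(gt_boxes):
        raise ValueError("predictions and ground truth differ in length")
    if not pred_boxes:
        return 0.0
    total = sum(iou_xywh(p, g) for p, g in zip(pred_boxes, gt_boxes))
    return total / len(pred_boxes)


# Known-answer cases against one 10x10 ground-truth box.
gt = [0, 0, 10, 10]
cases = {
    "identical": ([0, 0, 10, 10], 1.0),                # 100 / 100
    "shifted right by half": ([5, 0, 10, 10], 1 / 3),  # 50 / 150
    "shifted diagonally": ([5, 5, 10, 10], 1 / 7),     # 25 / 175
    "touching edge only": ([10, 0, 10, 10], 0.0),      # zero-width overlap
    "fully disjoint": ([50, 50, 10, 10], 0.0),
}
for name, (pred, expected) in cases.items():
    assert math.isclose(iou_xywh(pred, gt), expected), name

# Treat the five cases as a toy five-frame sequence.
seq_score = mean_overlap([pred for pred, _ in cases.values()], [gt] * len(cases))
```

The edge cases matter more than the easy ones here. Boxes that only touch, or boxes with zero area, are where two implementations of the same metric are most likely to disagree.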
```python
def center_error(box_a, box_b):
    """Pixel distance between the two box centers."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Center of an [x, y, w, h] box is its top-left corner plus half its size.
    cax, cay = ax + aw / 2, ay + ah / 2
    cbx, cby = bx + bw / 2, by + bh / 2
    return ((cax - cbx) ** 2 + (cay - cby) ** 2) ** 0.5
```

A single sequence is those two, combined. Overlap says the box is the right size and in the right place. Precision, the fraction of frames where the center stayed inside a tolerance, says the tracker never wandered off the target entirely. A clip can score well on one and badly on the other, and the combined number refuses to...