When a Kaggle Submission Is Rejected Before It Scores
Alan Scott Encinas
3 min read · 2 hours ago
You train for months. You walk up to the start line. And before the gun even fires, a judge pulls you off the track because your bib number has the wrong count of digits. That is almost exactly how my first submission to this competition went.

This is the third entry in a log I’m keeping while I compete. The model is the glamorous part. This one is about the unglamorous part that almost sank me twice before any model got judged at all.

A quick reset

If you’re just joining: I’m tracking a single object through hyperspectral video for the Hyperspectral Object Tracking Challenge 2026, footage shot in bands the human eye cannot register. Everyone pictures the hard part as the tracking. The hard part, this week, was getting a single valid file accepted.

The template that lied

Kaggle hands you a template file, the sample submission, that shows the exact shape your answer must take: one row for every frame you are supposed to predict. The obvious move is to match it exactly, so I did. My file had 26,760 rows, the same as the template. Rejected. The scorer wanted 26,860. The template was a hundred rows short.

It was stale. The sample had been generated against an older cut of the data and never refreshed. Three sequences in the real set had more frames than the template admitted. One of them, a clip of jellyfish shot in near-infrared, ran ninety-eight frames longer than the file claimed. Two others were off by a single frame each. I had trusted the convenience file, and the convenience file was wrong.

The fix was to stop trusting it. Instead of copying the template’s row counts, I counted the actual image frames sitting on disk and built the submission from those. 26,860 rows. Accepted. The lesson is older than machine learning: when the map disagrees with the territory, believe the territory.

The data didn’t want to be downloaded

Getting the data at all was its own small siege. The dataset lives on Google Drive, and Drive does not want you pulling dozens of large files in a row. The usual download tool died after about thirty-five files with a “too many accesses” slap on the wrist. The way through was a different download endpoint, hit directly, which Drive does not throttle the same way. And a few of the infrared sets were not files at all but folders, which failed in their own special manner and had to be pulled through a browser, in pieces.

None of this is in a paper. Nobody writes one.

The brutal arithmetic

Here is why it matters anyway. A flawless tracker that emits 26,760 rows scores exactly zero. Not a low score. Zero. The plumbing is not beneath the work, the way people quietly assume it is. The plumbing is half the work, and it is the half that decides whether the other half is ever allowed to count.

A results piece skips all of this and shows you the winning number. A log tells you that the number nearly didn’t exist, because a stale CSV and a download limit stood between me and the scoreboard, and both had to be beaten before a model could even be wrong.

Where I’m standing right now

0.524 on the board. Under the podium line, but on the board, which a row-count bug nearly prevented. The baseline I keep quoting only exists because I lost an afternoon to a hundred missing rows and won it back.

With the data in hand, the submission finally scoring, and a local harness I can trust, the project stops being about setup and starts being about the gap. Three and a half hundredths to the podium is the whole game from here. That chase is where this log goes next.

More in this series

This is part of an ongoing builder’s log written from inside live competitions. You’re reading where I was, not where I am.
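For anyone who wants to run the same check, here is a minimal sketch of the row-count fix described earlier: count the frames on disk, compare them with the template, and build the rows from the disk counts. It is not the competition's actual format. The folder layout (one folder per sequence, one image file per frame), the `sequence` column name, the list of image suffixes, and every function name are assumptions made for illustration.

```python
import csv
from collections import Counter
from pathlib import Path

# Assumed layout (hypothetical): data_root/<sequence_name>/<one image per frame>
FRAME_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def count_frames_on_disk(data_root):
    """Return {sequence_name: number of image files found in that folder}."""
    counts = {}
    for seq_dir in sorted(Path(data_root).iterdir()):
        if seq_dir.is_dir():
            counts[seq_dir.name] = sum(
                1
                for f in seq_dir.iterdir()
                if f.is_file() and f.suffix.lower() in FRAME_SUFFIXES
            )
    return counts


def count_rows_in_template(template_csv, seq_column="sequence"):
    """Return {sequence_name: number of rows the template holds for it}.

    The column name is an assumption; adjust it to the real sample file.
    """
    with open(template_csv, newline="") as fh:
        return dict(Counter(row[seq_column] for row in csv.DictReader(fh)))


def find_mismatches(disk_counts, template_counts):
    """Return {sequence: (rows_in_template, frames_on_disk)} where they differ."""
    names = set(disk_counts) | set(template_counts)
    return {
        name: (template_counts.get(name, 0), disk_counts.get(name, 0))
        for name in sorted(names)
        if template_counts.get(name, 0) != disk_counts.get(name, 0)
    }


def build_rows_from_disk(disk_counts):
    """Yield one (sequence, frame_index) pair per frame actually on disk."""
    for name in sorted(disk_counts):
        for frame_index in range(disk_counts[name]):
            yield name, frame_index
```

Running `find_mismatches` before writing anything makes a stale template visible as a short list of sequences instead of a rejected upload, and `build_rows_from_disk` keeps the row total tied to the files rather than to the sample.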
Originally published at https://alanscottencinas.com on June 26, 2026.
Machine Learning
Kaggle
Computer Vision
Data Science
Hyperspectral Imaging
Written by Alan Scott Encinas
AI Engineer & Systems Architect. I turn complex ideas into working systems: cognitive AI, autonomous systems, robotics, defense.