What drawing lines on a football pitch taught me about the future of human-AI collaboration | KAY SINGH

Foreword: Before we begin, I have to say that you gotta trust me. I’m a normal sports fan and don’t overanalyze the heck out of a goal like this. Honestly, I’m not even sure if I should be writing this :)
Trust me, my first reaction was: WOW! What a goal!

When I looked at the 12-second mark in the above clip, I understood how this goal was scored (like every other armchair football analyst).

For those not familiar with football, here’s what happens: Marquinhos blocks the initial shot attempt. Then, if you look closely from 0:12 onwards, you can see that Marquinhos’ momentum makes him rock back by a few inches, giving Luis Díaz an opening to cut right and create a new shooting opportunity.
TL;DR

Here is the final result of how the goal was scored. Marquinhos rocks back ~0.6 ft and Luis Díaz cuts right ~8.9 ft, which opens the shot window from ~5.5 ft / 8.6° to ~7.5 ft / 22.0°.

The part I keep thinking about is this: if I had let Codex fully guide the project, I would have ended up with a confident but wrong result. My football intuition caught things that Codex did not know to care about. The full story of how we got there is below.

The Long Version

For everyone else continuing on this ride, let’s talk about the rest of the story.

Can Codex solve this?

I gave Codex the screenshots of the play and described what I wanted. It thought other angles might help, so I pulled YouTube screenshots from a few alternate camera angles, seen below.

After a bit of back and forth, I decided that the following two frames best represented the changes in distance we wanted to calculate (player movement, goal window), so I asked Codex to use these as the primary angle and the other angles for validation.
Codex did what AI tools are very good at: it turned a fuzzy idea into a working direction. It suggested ways to measure distances, picked a Python image-processing stack, started detecting points, and generated annotated outputs. The first version looked convincing, which I later found out was exactly the problem.

The problem Codex created

I do not know much about computer vision, but I do know football. In the first attempt, Codex claimed that the “goal window became 2.3x wider”, which did not pass the smell test for me. I manually compared the open goal window to the right of the goalkeeper in pixels, and the new window definitely did not look 2.3x wider.

Out of curiosity, I asked Codex to “show its work” by “visualizing” how it came up with the distance calculations so I could review them. As I looked at the following annotated images, I quickly saw how far off the calculations were. Codex tried to find goal posts, player positions, field lines, and distances, but each measurement looked wrong.
Some points were not exactly on the bottom of the goal posts. Some foot markers were close, but not close enough. The shot cone angle points were not in the expected places. This was a problem because if the goal post marker was five pixels off, or the player foot marker was placed on a shadow instead of the boot, the final number inherited that error.
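To make that concrete, here is a minimal Python sketch of how a five-pixel marker error carries into a shot cone angle. Everything in it is an assumption for illustration: the `cone_angle_deg` helper, the pixel coordinates, and the simplification of measuring the angle directly in image pixels (which ignores camera perspective). It is not the code Codex wrote.

```python
import math


def cone_angle_deg(apex, p1, p2):
    """Angle in degrees at `apex` between the rays to `p1` and `p2`."""
    a1 = math.atan2(p1[1] - apex[1], p1[0] - apex[0])
    a2 = math.atan2(p2[1] - apex[1], p2[0] - apex[0])
    diff = abs(a1 - a2)
    # Keep the smaller of the two angles between the rays.
    return math.degrees(min(diff, 2 * math.pi - diff))


# Hypothetical pixel coordinates (x, y), not taken from the real frames.
ball = (640.0, 520.0)
keeper_edge = (700.0, 300.0)   # edge of the goalkeeper nearest the open window
post_base = (760.0, 300.0)     # where the goal post meets the ground

baseline = cone_angle_deg(ball, keeper_edge, post_base)

# Same frame, but the post marker lands 5 px inside the post.
shifted_post = (post_base[0] - 5.0, post_base[1])
shifted = cone_angle_deg(ball, keeper_edge, shifted_post)

print(f"cone angle with correct marker: {baseline:.2f} deg")
print(f"cone angle with marker 5 px off: {shifted:.2f} deg")
print(f"relative change: {(shifted - baseline) / baseline:+.1%}")
```

The narrower the window is in pixels, the larger the share of the measurement a few stray pixels take up, which is why small marker errors mattered so much here.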
What I discovered was that agents can spend a lot of time “working”, but that does not reduce their tendency to hallucinate.

Learning 1: Human domain expertise still matters

I asked Codex whether using known football dimensions would simplify the measurement problem. Specifically, a regulation goal is 24 feet wide and 8 feet high. Similarly, the six-yard box, penalty area, and penalty mark have known dimensions. If the broadcast frame showed the goal mouth clearly, maybe we could use the known goal width to calibrate the shot window. Codex agreed that these would be “very useful”, but I was left wondering why it had not suggested them in the first place.

Learning: The first insight came from me and not the AI. I used my domain expertise to work from the known knowns to the unknowns.

Learning 2: Human-in-the-loop still matters

After repeatedly prompting Codex with various versions of “Figure out better ways to detect objects more accurately”, I had an epiphany. Why was I building a fully automated pipeline? Why couldn’t I provide “human judgement” to make the detection better?

The automated pipeline kept placing the left goal post marker a few pixels inside the post rather than at the base. It was a small error that compounded through every downstream calculation. I figured that if I could just click where I knew the base was, we could skip the guesswork entirely. So I asked Codex if I could mark the important points and objects in the images used for the calculations.

Codex built a manual distance workbench using HTML/CSS where I could load the frame, zoom in, place small points, drag endpoints, mark things as approved, and save the review...
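The two learnings fit together: hand-placed points supply trustworthy pixels, and the regulation goal width supplies the scale. Here is a minimal Python sketch of that idea. The point names, the pixel coordinates, and both helper functions are hypothetical, and the single feet-per-pixel scale is only a reasonable approximation for points that sit on the goal line itself, because it ignores camera perspective.

```python
import math

GOAL_WIDTH_FT = 24.0  # regulation goal width between the posts (8 yards)


def feet_per_pixel(left_post_base, right_post_base):
    """Scale along the goal line, derived from the two clicked post bases."""
    pixel_width = math.dist(left_post_base, right_post_base)
    if pixel_width == 0:
        raise ValueError("post markers must be two distinct points")
    return GOAL_WIDTH_FT / pixel_width


def goal_line_distance_ft(p, q, scale):
    """Distance in feet between two points that both lie on the goal line."""
    return math.dist(p, q) * scale


# Hypothetical points clicked by a human reviewer, in image pixels (x, y).
clicked = {
    "left_post_base": (100.0, 300.0),
    "right_post_base": (580.0, 300.0),
    "keeper_edge": (400.0, 300.0),  # goalkeeper's edge projected onto the goal line
}

scale = feet_per_pixel(clicked["left_post_base"], clicked["right_post_base"])
window_ft = goal_line_distance_ft(
    clicked["keeper_edge"], clicked["right_post_base"], scale
)

print(f"scale: {scale:.3f} ft per pixel")
print(f"open window along the goal line: {window_ft:.1f} ft")
```

Measuring anything away from the goal line (a player's step, for example) would need a perspective-aware mapping built from more of the known pitch markings rather than this single scale.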