Designing an AI-Native Technical Screen

Aniruddha Jun 09, 2026

When I joined Pocus we ran a typical technical screen, two coding interviews back to back. As LLMs became central to how code gets written and understood, it became clear that our screening process was not filtering for the right candidates. Over and over, I'd watch strong candidates clear both rounds and still have no read on whether they could survive in code they didn't write. Everything around us was new and moving fast, so we went back to first principles.

Technical screens have a constraint the rest of the loop doesn't: almost everyone goes through them, so they have to be cheap to run and high signal per minute. That also caps how open-ended they can be. The more a screen depends on interpretation, the less consistent it is across candidates and interviewers, and consistency is most of what a screen is for.

First Principles

When AI writes most of the code, it's worth asking from scratch what changes for the person doing the engineering. A few things kept standing out.

Engineers live in unfamiliar code now

The model makes it cheap to step outside your lane. A backend engineer ships a frontend change, a project needs a one-line fix in a service another team owns, something in a language you've never written needs a small adjustment and you do it anyway. The codebase you're fluent in is a shrinking fraction of the code you touch.

Reading matters more than writing

The model generates faster than any human types, and most of what it produces is plausible. If you can't read code fluently enough to keep pace with what's being generated, you can't keep up with where the work is going. You're just approving things.

Communication stopped being optional

As an industry, we spent a long time treating communication as the soft thing you could skip if someone could really code. That trade-off doesn't exist anymore. Telling a model what you want in plain language is not a different skill from telling a human.

None of these is what a blank editor measures.

The Interview

Here’s the shape it settled into. Candidates knew the terms up front: they'd work in a TypeScript codebase, in their own dev environment, and the code would be unfamiliar. The interview lasted one hour.

At the start we shared a repository, a moderately sized server that was more than a toy, and walked them through how it behaved when nothing was wrong, plus the scripts to compile and run it. Then they went bug-hunting. There were three bugs, increasing in difficulty. The first required almost no reading, a warm-up to get them into the repo. The second needed a small, localized change. The third needed real intuition about how the system fit together. None of them asked for a complex feature.

They could use a model freely: ask it about the codebase, about TypeScript, about the error in front of them. The one thing they couldn't do was ask it to find the bug. The moment you let someone hand the diagnosis to the model, you stop learning whether they can read and reason; you just watch the model work.

Each constraint had a reason: an unfamiliar repo because that's the actual job now, their own environment because that's the real condition the job runs in, and no asking for the bug because the reading is the point. Run that in an hour, and the onsite is freed up to be the expensive part: actually building something with a model.

Candidate Patterns

We rolled this out in late 2024, and I ran more than a hundred of them myself. A handful of patterns showed up across almost every candidate.

The Skeptic

The interesting thing about the skeptic was never their opinion of AI. It was that their approach wouldn't bend when the problem outgrew it. They'd commit to reading every line by hand, which is admirable until the third bug, when the strategy that got them through the first two simply stops scaling and they keep running it anyway. Very few of them reached the end.

The Executor

The executor used the model the way a harness uses a model: asking it for a fix, pasting the fix in, running it, and asking again when it didn't work. The trouble is that a harness usually has a way to verify the result, and this one didn't. They became the model's hands without being its judgment, churning through suggestions they couldn't evaluate, getting further from the bug with every confident wrong turn.

The Deflector

Some candidates spent the interview litigating the interview itself. The question was unrealistic, the bugs were contrived, real work doesn't look like this, and here are the reasons, delivered at length, in the time that could have been spent finding the bug. Attacking the question instead of the problem is itself the signal.

The Leetcoder

A surprisingly large number of people had drilled the traditional loop so thoroughly that anything slightly out of band knocked them over. Strong on a clean algorithmic prompt, lost the moment the...
