Zen and the Art of Machine Learning Research



Token for Token


temperament over talent

Jack Morris<br>Jun 15, 2026


So you want to do AI research? It’s true that no one really teaches you how. Not directly, anyway.<br>It turns out the process of becoming a great researcher is not unlike learning to meditate.<br>I.<br>The way to get started is pretty simple, through some combination of<br>(a) reading and learning, and<br>(b) building stuff.<br>You can’t only do one. You’ll become a researcher through this combination.


There’s an old Zen saying that goes something like this –<br>on days we find insight, we sit.<br>on days we do not find insight, we sit.

Doing research is basically like this. Scientific insights can come seemingly at random. Most days they will not come. An important trait for success is just putting in the time & effort. Like any other pursuit (music, sports, sales, etc.), if you want to become world-class, it will take a tremendous amount of discipline.<br>Noam Shazeer makes a nice hat-tip to the inherent randomness of successful research ideas in the SwiGLU paper (“GLU Variants Improve Transformer”):<br>“We offer no explanation as to why these architectures seem to work; we attribute their success, as all else, to divine benevolence.”<br>A related comment is that it’s possible to read too many papers. If you want to solve a problem, the tried-and-true path to success is to attempt a solution, hit a bottleneck, try to get past it, and only reach for the literature when you’ve run out of ideas yourself.<br>II.<br>Fine, but what should I work on?<br>If you’re just starting out, here’s my honest answer: I don’t think the exact topic matters much.<br>That said, I would warn you against choosing things that have been popular for less than six months. AI moves fast, but the fundamental ideas haven’t changed in forty years. If you want to make a career out of this, I wouldn’t advise you to think too hard about the concepts of 2026: harnesses, agents, context engineering, etc. These will change.<br>Instead, you’ll learn more by going back to the basics: learn what cross-entropy is. Compute it by hand for a small distribution. Deeply understand SVD, to the point where you can start to visualize it in your head. Don’t think too much about RL for coding specifically; instead, learn the ideas behind policy gradients, why they’re useful, and why they’ve been popular for decades.<br>One more meta-comment: if the best possible outcome of your research project is a higher score on an existing benchmark, you are not going deep enough.
Often, existing datasets won’t test new, interesting capabilities.<br>Jason Wei makes a similar point:<br>“An underrated but occasionally make-or-break skill in AI research (that didn’t really exist ten years ago) is the ability to find a dataset that actually exercises a new method you are working on.”
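If "compute cross-entropy by hand for a small distribution" sounds abstract, here is roughly what that exercise might look like in plain Python. This is only a sketch: the three-token vocabulary and all of the probabilities are made up for illustration, and the function names are my own, not from any library.

```python
import math

def cross_entropy(p, q):
    """Cross-entropy H(p, q) = -sum_i p_i * log(q_i), in nats.

    Terms with p_i == 0 contribute nothing, so they are skipped.
    """
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    """Entropy H(p) is just the cross-entropy of p with itself."""
    return cross_entropy(p, p)

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical model prediction over a three-token vocabulary.
q = [0.7, 0.2, 0.1]

# Case 1: a one-hot target. The loss collapses to -log(q[correct token]).
p_onehot = [1.0, 0.0, 0.0]
loss = cross_entropy(p_onehot, q)  # equals -log(0.7), about 0.357 nats

# Case 2: a soft target. Here the decomposition H(p, q) = H(p) + KL(p || q)
# becomes visible: the loss can never drop below the entropy of the target.
p_soft = [0.5, 0.5, 0.0]
h_pq = cross_entropy(p_soft, q)
h_p = entropy(p_soft)
kl = kl_divergence(p_soft, q)

print(f"one-hot loss: {loss:.4f}")
print(f"soft target:  H(p,q)={h_pq:.4f}  H(p)={h_p:.4f}  KL={kl:.4f}")
```

The second case is the part worth staring at: once you see that cross-entropy splits into a fixed term (the target's own entropy) plus a KL term, it becomes clearer why minimizing it pulls the model's distribution toward the data's.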

As for a concrete suggestion, I can’t make one; that has to come to you. Go deep, focus on the basics, and don’t chase benchmarks. Stay in the water and the ideas will come.<br>III.<br>in the beginner’s mind there are many possibilities; in the expert’s mind there are few<br>– Suzuki

Something often repeated in Silicon Valley these days is how experience in AI research might actually be counterproductive to good research intuition in the modern day. I’ve observed parts of this up close; many researchers from the pre-scaling era remain interested in designing methods that work at a small scale but will obviously fail when tested at scale.<br>One really impressive thing about OpenAI is that most of the people running the company (on the technical side, at least) are under 35. Many of the important decision-makers behind ChatGPT are under 30. One thing we can take away from this is that since AI is such a nascent field (ChatGPT is less than four years old!) no one has a huge advantage, because no one has been working on it for very long.<br>In short, holding on to ideas for too long can actually be counterproductive. Stay open-minded and refuse to let ego cloud your judgement.<br>IV.<br>Inspiration strikes when you least expect it.<br>Here are two examples from history:<br>The discovery of the structure of the benzene ring famously came in a dream: the structure had never been seen before, but was imagined as a snake biting its own tail.

Ozempic basically comes from lizards. GLP-1 itself is a human hormone, but the first drugs in Ozempic’s class were modeled on exendin-4, a GLP-1-like peptide found in the venom of the Gila monster, a desert lizard that eats just a few times a year. Somehow we figured out how to make this work for humans too.

One important takeaway is that to do good research, you must do things other than research. Most of my personal “aha moments” happened away from the keyboard,...

