Thoughts on the Near Future


46 thoughts on the near future

bayes

@bayeslord · 12h


This list is based on a thread I posted on June 4th. A few edits and additions here and there. Several people asked me to make the thread easier to read, so here it is.

Intelligence

I think people are going to be blindsided by algorithmic progress. The entire world, markets, governments, militaries, companies, people, etc. are all trying to make sense of AI and its impact in terms of the recent past’s production efficiencies and regularities, and how things appear to be going. Even several of the purportedly “RSI”-pilled neolabs seem to think this will be business as usual but with an agent in a loop. No. My guess is there are many algorithmic OOMs (orders of magnitude) left to go in the production of intelligence, maybe (maybe) up to ten, with four to seven seeming more likely. Going beyond even ten is possible in principle, but it strains hard against what I suspect the universe will actually let us do. Implausible but not impossible. If this is true then things aren’t actually going as they appear to be going and a big jump is coming. Anything along these lines happening would make things far weirder than almost anyone seems to be pricing in.

We are in early takeoff. AI improving AI may end up being one of the most consequential steps of history. This isn’t certain because we don’t know how far from the physical and computational limits of intelligence we are, though I would bet it’s quite far from where we are today (as I said above, ~4-10 OOMs more intelligence output per unit of scale seems possible).

Now that we’re in takeoff, algorithmic research is accelerating. Compute is still a scarce resource, but researcher-time opportunity costs are lower because you can just send an agent on any quest or wild goose chase. It might come back with something. All new ideas come with optimization debt that can now be paid in unsupervised token spend. Vast numbers of research scaling law curves will be traversed.

AI models, especially at the frontier, will keep getting better. The only true wall is physics. Models are increasingly autonomous and smart, and they are getting better all the time. Math and code are falling to scale+RL; everything else is up next. Verifiable vs. non-verifiable as a meaningful distinction will fade. Automated AI research and AI learning are going to look more and more related as we go forward. Training models well is closely related to models learning well in general. Sample efficiency, creativity, and all other limitations will be solved and then start approaching algorithmic optimality at whatever scale.

The idea that long-horizon agents always need equivalently long-horizon training is wrong because generalization in time exists. Long tasks are not made of longness! This is related to LeCun’s fallacy of (1-e)^n error accumulation: the claim that a per-step error rate e compounds, so an n-step task succeeds with probability only (1-e)^n. What’s actually going on is error correction. This happens at multiple scales, from the single-token generation level up to steps in a long task. Part of the reason the METR graph goes up is that agents are starting to hit error correction escape velocity.
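To make the arithmetic concrete, here is a minimal sketch comparing pure error accumulation with a toy model of error correction. The correction model is an assumption made for illustration, not something from the thread: each step errs independently with probability e, and each error is caught and repaired with probability c before it becomes fatal. The function names and the parameter c are hypothetical.

```python
def success_no_correction(e: float, n: int) -> float:
    """Probability that all n steps succeed when any single error is fatal."""
    return (1.0 - e) ** n


def success_with_correction(e: float, c: float, n: int) -> float:
    """Toy model (an assumption): an error is only fatal if it is NOT corrected.

    Each step errs with probability e; each error is repaired with
    probability c, so the per-step fatal probability is e * (1 - c).
    """
    return (1.0 - e * (1.0 - c)) ** n


# Compare the two models as the task horizon n grows.
for n in (10, 100, 1000):
    plain = success_no_correction(0.01, n)
    corrected = success_with_correction(0.01, 0.99, n)
    print(f"n={n:5d}  no correction={plain:.6f}  with correction={corrected:.6f}")
```

In this toy model, the uncorrected success rate collapses as n grows, while a high enough correction rate keeps long tasks viable. Real agents correct errors at several scales at once, which this single-parameter sketch does not capture.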

An engineering-grade science of deep learning is imminent. This will drive us to AI algorithmic maturity much more rapidly than people are expecting, though as I mentioned above it’s not clear how far this can go even in principle. For example, a science of scale-invariance dramatically increases the scale and returns of useful experimentation because experiments on one GPU can tell you how to use one hundred thousand.
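One simple way to picture the small-to-large extrapolation is fitting a power law to small runs and reading off a prediction at large scale. The sketch below is only an illustration under strong assumptions: that loss follows a clean power law in scale, and that the "runs" are synthetic points generated from an assumed law rather than real measurements. The names fit_power_law and predict, and the constants, are hypothetical.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares in log-log space."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space = log(a)
    return a, b


def predict(a: float, b: float, x: float) -> float:
    """Evaluate the fitted power law at scale x."""
    return a * x ** b


# Synthetic "small-scale runs" (assumed law, NOT real data): loss = 5 * gpus**-0.1
ASSUMED_A, ASSUMED_B = 5.0, -0.1
gpus = [1, 2, 4, 8]
losses = [ASSUMED_A * g ** ASSUMED_B for g in gpus]

a, b = fit_power_law(gpus, losses)
print(f"fitted a={a:.4f}, b={b:.4f}")
print(f"extrapolated loss at 100,000 GPUs: {predict(a, b, 100_000):.4f}")
```

The extrapolation is only as good as the assumption that the same law holds across five orders of magnitude, which is exactly what a science of scale-invariance would need to establish.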

There will be Move 37 moments for every domain of technical human endeavor and then, quite quickly, Move 37s will seem quaint. I mean for everything.

Compute is going to keep improving. Today’s best matmul machines are nowhere near the physical limits of AI accelerators. There’s a lot of room to get better at digital silicon. There are also many candidates for new substrates, and the algorithmic debt they owe will be automated to its limits, but we don’t yet know what the optimal one is for AI in space/energy/time/manufacturability/cost. Photonics and stochastic silicon are both interesting candidates, but I also expect the singularity to be surprising.

How far ahead the labs can get depends in part on the returns to automation and scale, which includes the returns to greater algorithmic depth. If deep learning practice (and theory) is forever shallow then the moat will mostly not be algorithmic in the longer term because secrets will be relatively cheap to discover. Eventually distillation + data + time can catch up to compute scale, potentially slowly. So far this seems partly where we’re at, but even if true there are no guarantees it will continue this way.

If things become less shallow as we scale then every increment of automation and scale buys you algorithmic secrets that are increasingly out of reach for anyone else. This too seems partly where we’re at. The end point in either case is...
