I keep pushing the frontier models to their limits, and I have several projects they still can't solve that I benchmark new models on. Every new model makes it easier to "solve" even harder problems, but I still have the feeling that they rely 99% on my ideas. They just don't come up with the ideas themselves, and I have to hold their hand and help them.

Don't get me wrong: they excel at anything that is already close to done, and they can combine existing techniques. I'm talking about new ideas the models have never seen before.

For example, I have a hobby project that pushes what's possible with route optimization. Yes, it's close to SOTA and way more efficient than all (?) other solutions out there (punnerud.github.io/mpee/), but I have to hold the model's hand and brainstorm ideas on how to compress a matrix.

And it's not a one-time thing; it happens something like 40-50 times in a few days.

That remaining 1% is the "new ideas" part. Why can I come up with all of these, and not the model? It's a really hard eval to create. This project is open now; later I'm thinking about making a frontier project in the same way, keeping it away from the public and using it as a benchmark. Is that the best way to test for new ideas in models?