To the moon?
Future Telescope 37
I started writing this newsletter when I was thinking about AI more than I was thinking about anything else. I then stopped writing this newsletter for a while because I was thinking more about other things. These days I’m back to thinking more about AI again, and boy, do I have some thoughts.
One thing is becoming clear: excellence, meaning being close to the 100th percentile of human ability in a field, is becoming disproportionately valuable and will continue to be for the foreseeable future. In early LLMs, most compute went into pre-training once its scaling behavior was established. These days most compute goes into post-training, which means the big labs are hard at work on RLVR (reinforcement learning with verifiable rewards). More and more economically valuable human activities are being turned into RL data, which makes the models smarter and, combined with agentic architectures, better suited to completing specialized tasks. The goal is to get so good at specialized tasks that generalizing across them becomes asymptotically more likely, and is eventually achieved.
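For the unfamiliar, the core of the RLVR idea fits in a few lines. Here is a minimal sketch (the function and field names are my own illustration, not any lab's actual pipeline): a task is "verifiable" when a program, rather than a human, can score the model's output, so the reward signal is cheap and exact.

```python
# Minimal sketch of a verifiable reward: a checker program scores the output.
# Names here (verifiable_reward, "checker", "prompt") are illustrative only.

def verifiable_reward(task, model_output):
    """Return 1.0 if the output passes the task's checker, else 0.0."""
    return 1.0 if task["checker"](model_output) else 0.0

# Example: an arithmetic task where the checker is exact-match on the answer.
task = {
    "prompt": "What is 17 * 24?",
    "checker": lambda out: out.strip() == "408",
}

# In real post-training, candidates would be sampled from the model;
# here we stub two outputs just to show the binary reward signal.
for candidate in ["408", "407"]:
    print(candidate, verifiable_reward(task, candidate))
```

The point of the binary, machine-checkable reward is that any task you can wrap in such a checker (math, code that passes tests, games) becomes RL training data at scale, which is why "verifiable" fields are the first to be absorbed.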
A further indication in this direction comes from Dario Amodei's claims over time. In Jan 2025 he wrote, "Importantly, because this type of RL is new, we are still very early on the scaling curve", going on to say that companies were getting significantly larger gains by spending $1M on RL than by spending $0.1M. He expected this scaling to continue, and he reinforced the point in a Feb 2026 conversation, saying that just as pre-training scaled to the moon in the last decade, scaling RL to the moon is the key to further progress in this one. The Alibaba-backed MiniMax M2.5 also reportedly used scaled RL as a significant factor in its rapid performance gains.
What does this mean for you? If you are at a median level of ability in your given pursuit, it probably means that at some point a model will get RLVR'd into being more capable than the median human at that task, leaving you unable to compete with artificial intelligence to justify your economic value.
On the other hand, we may find that, as François Chollet says, "Sufficiently advanced agentic coding is essentially machine learning…", which in his view implies that agentic systems will soon fall prey to familiar ML problems like overfitting, data leakage, and concept drift.
Either way, it is likely that for people close to the 100th percentile in a field, AI won't be as hard a competitor: either its architecture will fall apart due to issues inherent to ML, or the RLVR approach just won't scale to the 100th-percentile human for lack of enough verifiable rewards. So put yourself in a place where you are close to the top percentile, and you are more likely to benefit disproportionately compared to previous times in history.