Modern CPU-s suck horribly. All we were able to do in 30 years is to finally switch to a weak memory model and a partially reduced instruction set (AArch64 still has lots of complex instructions).
CPU-s internally haven’t employed the von Neumann architecture for a long time: they are all Harvard-based, with strictly separated code and data flows starting from the L1I/L1D caches and all the way to the ALU. CPU-s don’t perform calculations step by step; they do them concurrently (instruction pipelining, out-of-order execution, etc), but pretend those happened one by one. They pretend memory is uniform, but RAM is effectively external storage for the CPU; modern CPU-s actually boot up without RAM available at all, working on caches and registers only, so reading uncached data from RAM is kinda like fetching data from NVMe storage. CPU-s are synchronous (timing-aware) and eventful (they actually react). Fully asynchronous execution is so restrictive that you cannot even implement consensus between multiple asynchronous runners (essentially the FLP impossibility result, once a single runner may fail). Developers tried to slap lipstick onto it, i.e. introduce interrupts, but without a paradigm change you just get race conditions when oblivious asynchronous code encounters data unexpectedly modified by a concurrent interrupt.
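A minimal POSIX-flavoured sketch of that interrupt race; the names (on_alarm, the SIGALRM timer, the loop count) are illustration only. The main loop and the signal handler both do a non-atomic read-modify-write on the same variable, which is exactly the "oblivious code vs concurrent interrupt" situation described above:

```cpp
#include <csignal>
#include <cstdio>
#include <sys/time.h>   // setitimer (POSIX)

static long hits = 0;   // shared with the handler, NOT atomic: this is the bug

extern "C" void on_alarm(int) {
    ++hits;             // handler side of the race
}

int main() {
    std::signal(SIGALRM, on_alarm);

    // Fire SIGALRM roughly every millisecond while the loop runs.
    itimerval tv{};
    tv.it_interval.tv_usec = 1000;
    tv.it_value.tv_usec    = 1000;
    setitimer(ITIMER_REAL, &tv, nullptr);

    for (long i = 0; i < 100'000'000; ++i) {
        // ++hits is a load, an add and a store. If SIGALRM lands between the
        // load and the store, the handler's increment is silently overwritten.
        ++hits;
    }

    // The printed value comes up short in a non-deterministic way; formally
    // this is undefined behaviour, and the cure is volatile std::sig_atomic_t
    // or a lock-free std::atomic, i.e. admitting the concurrency explicitly.
    std::printf("hits=%ld\n", hits);
}
```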

The revolution in programming was already happening, but the transformation was much more tragic than most of you imagine while reading IT news. The main engine of the transformation was the internet, with its huge archives of answers and boilerplate like stackoverflow and github. We used to employ google and others to look for solutions; LLM-s brought that to another level, but the source data and the purpose did not change, and that’s important. When you reproduce the same solutions over and over and over again, the industry just hopelessly ossifies, and the smallest changes (like additional syntax sugar in a programming language) are viewed as a revolution because we need to retrain all the LLM-s to support the new syntax. Take std::map: it is still a red-black tree despite being cache-hostile on the PC CPU-s of the last 30 years, and in interviews you can still encounter the “write a binary tree” task for some web frontend vacancy… in the era of persistent data structures and array mapped tries.
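To make the std::map complaint concrete, here is a minimal sketch (not a benchmark; the sizes and key are arbitrary) contrasting the node-based red-black tree, where every step of a lookup chases a pointer into a separately allocated node, with a flat sorted vector searched via std::lower_bound over contiguous memory, which is roughly the layout that flat maps and B-tree maps adopt:

```cpp
#include <algorithm>
#include <cstdio>
#include <map>
#include <vector>

int main() {
    constexpr int N = 1'000'000;

    std::map<int, int> tree;                // red-black tree: one heap node per key
    std::vector<std::pair<int, int>> flat;  // contiguous storage, sorted by key
    flat.reserve(N);

    for (int i = 0; i < N; ++i) {
        tree.emplace(i, i);
        flat.emplace_back(i, i);
    }

    int key = 777'777;

    // ~20 pointer dereferences into scattered allocations for N = 1'000'000,
    // each one a potential cache miss.
    auto it_tree = tree.find(key);

    // ~20 comparisons as well, but over one contiguous array the prefetcher
    // can actually help with.
    auto it_flat = std::lower_bound(
        flat.begin(), flat.end(), key,
        [](const std::pair<int, int>& p, int k) { return p.first < k; });

    std::printf("map: %d, vector: %d\n", it_tree->second, it_flat->second);
}
```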

Now stackoverflow is dying and github is full of LLM-generated code. The worst thing here is that LLM-s now produce an insane amount of essentially identical content, which is not only useless as training data but actively harmful:
https://arxiv.org/abs/2305.17493 - The Curse of Recursion: Training on Generated Data Makes Models Forget
https://arxiv.org/abs/2509.04796 - Knowledge Collapse in LLMs: When Fluency Survives but Facts Fail under Recursive Synthetic Training

LLM-s are just pattern matchers with interpolation; that’s by design, they cannot do anything else, since their whole training is about “given this array of tokens, remember the next token”. The main thing that made LLM-s and CNN-s (like Stable Diffusion) viable is the sheer computing power that GPGPU is capable of providing, and even that is several times less than what an ASIC or FPGA could deliver. Where a GPGPU can provide 50 TFlops, a CPU can do something like 0.5 TFlops on SIMD-only calculations, and much less on non-SIMD code. So CPU-s are hundreds to thousands of times slower than the theoretical capability of modern transistors. GPGPU is not smarter, it’s just more performant; the core “knowledge” lies in the internet, and it’s not the AI itself that is smart.
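The back-of-envelope arithmetic behind those numbers; every figure here (core count, clock, SIMD width, FMA throughput) is an assumption picked to represent a modest desktop CPU, not a measurement:

```cpp
#include <cstdio>

int main() {
    const double cores         = 8;     // assumed core count
    const double ghz           = 4.0;   // assumed sustained clock, GHz
    const double lanes         = 8;     // 256-bit AVX2, fp32 lanes
    const double fma_per_cycle = 1;     // assumed one FMA issue per cycle
    const double flops_per_fma = 2;     // multiply + add

    // cores * GHz gives giga-instructions/s; the rest converts to GFlops,
    // and /1e3 turns GFlops into TFlops.
    const double simd_tflops   = cores * ghz * lanes * fma_per_cycle * flops_per_fma / 1e3;
    const double scalar_tflops = cores * ghz * 1     * fma_per_cycle * flops_per_fma / 1e3;

    std::printf("peak SIMD  : %.2f TFlops\n", simd_tflops);    // ~0.51
    std::printf("peak scalar: %.2f TFlops\n", scalar_tflops);  // ~0.06
    std::printf("vs a ~50 TFlops GPGPU: %.0fx gap\n", 50.0 / simd_tflops);
}
```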

And this is the original sin that makes it impossible to achieve any decent performance on CPU-s: pretending we do actions one by one and their results are instantly visible across the system. It becomes even more ridiculous when people with this mindset try to design a distributed system and, as a first step, ask for a globally consistent data store (etcd, postgresql, etc). And due to the aforementioned mechanism of reproducing solutions there is a strong incentive not to change the status quo. LLM-s did not make CPU-s and programming this bad; they just said “Amen” and put a bold full stop at the end.
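A minimal sketch of what dropping that pretence looks like at the smallest scale, using standard C++ atomics; the point is only that on a weak memory model (AArch64, say) visibility is something you ask for explicitly, not something you get for free:

```cpp
#include <atomic>
#include <cstdio>
#include <thread>

static int payload = 0;
static std::atomic<bool> ready{false};

void producer() {
    payload = 42;                                  // ordinary store
    ready.store(true, std::memory_order_release);  // publish: everything before
                                                   // this becomes visible to an
                                                   // acquire load that sees true
}

void consumer() {
    while (!ready.load(std::memory_order_acquire)) { /* spin */ }
    // With acquire/release this is guaranteed to print 42. Had both sides used
    // memory_order_relaxed, the hardware would be free to let the consumer
    // observe ready == true while payload still reads 0.
    std::printf("payload = %d\n", payload);
}

int main() {
    std::thread t1(producer), t2(consumer);
    t1.join();
    t2.join();
}
```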

There will be no positive suggestions in this post; welcome to the neurorot era.