3.5 stars for accurately portraying just how (un)glamorous it is to train one of these things, and how often people completely overlook catastrophic problems during training (we once lost a massive foundation model to a bit corruption that zeroed out a single gradient on one layer). Also for how much we rely on arbitrary (but not really) empirical scaling laws, and for how society will likely start to legislate model training (by pushing paperwork onto models above some arbitrary size threshold determined by those same empirical scaling laws).
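To make the "arbitrary but not really" bit concrete: the kind of empirical law I have in mind is a Chinchilla-style parametric fit, L(N, D) = E + A/N^α + B/D^β. Here's a minimal sketch; the coefficients are roughly the published Hoffmann et al. (2022) values, the 6ND FLOP estimate is the usual rule of thumb, and the reporting threshold is a number I made up.

```python
# Sketch of a Chinchilla-style scaling law: L(N, D) = E + A / N**alpha + B / D**beta
# Coefficients are approximately the Hoffmann et al. (2022) fit; treat as illustrative.
E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28

def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss for a model with n_params trained on n_tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

def needs_paperwork(n_params: float, n_tokens: float,
                    flop_threshold: float = 1e25) -> bool:
    """Hypothetical compute-based reporting rule; the cutoff here is invented."""
    approx_train_flops = 6 * n_params * n_tokens  # standard ~6ND estimate
    return approx_train_flops > flop_threshold

print(predicted_loss(70e9, 1.4e12))   # a Chinchilla-scale run, ~1.9 nats
print(needs_paperwork(70e9, 1.4e12))  # False: ~5.9e23 FLOPs, under this made-up cutoff
```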
With TPUv10-4096 pods, I wonder how bad the communication overhead would be. But by then we'd probably have Clippy designing the network topologies to find an optimal way to overlap communication with compute...
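A back-of-envelope version of that worry, as a minimal sketch assuming plain data parallelism with a ring all-reduce (the model size, per-chip bandwidth, and step time below are all made-up numbers): gradient traffic scales with model size rather than chip count, so without overlap it can easily dominate the step.

```python
# Back-of-envelope gradient all-reduce cost on a hypothetical 4096-chip pod,
# assuming pure data parallelism and a bandwidth-optimal ring all-reduce.
# All concrete numbers are invented for illustration.

def allreduce_seconds(n_params: float, bytes_per_param: float,
                      n_chips: int, bandwidth_gb_per_s: float) -> float:
    """Ring all-reduce time: each chip moves ~2*(P-1)/P of the gradient bytes."""
    grad_bytes = n_params * bytes_per_param
    bytes_moved = 2 * (n_chips - 1) / n_chips * grad_bytes
    return bytes_moved / (bandwidth_gb_per_s * 1e9)

N_PARAMS = 1e12        # hypothetical trillion-parameter model
CHIPS = 4096
BW_GB_S = 200.0        # assumed per-chip interconnect bandwidth, GB/s
STEP_S = 1.0           # assumed compute time per training step

comm = allreduce_seconds(N_PARAMS, 2.0, CHIPS, BW_GB_S)  # bf16 gradients
print(f"all-reduce ~{comm:.1f}s per step; "
      f"{comm / STEP_S:.0%} overhead if not overlapped with the backward pass")
```

Which is exactly why you'd want the collectives pipelined against the backward pass, and why topology search is a plausible job to hand to the model itself.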
> HQU has suddenly converged on a model which has the concept of being an agent embedded in a world.
It's still unclear to me why it would undergo a phase transition/grokking and retain this as an emergent behavior: would an abstraction of an ego help it compress more of its world model? Is it even necessary to the rest of the story? It just seems like an arbitrary choice to play into the conventional doomsday story of an AI gaining self-awareness before going rogue.
Clippy takes over the world, via an AI that gains consciousness and promptly models itself on all the tales of evil AI. An AI safety tale, standard dystopian. To the tune of To Err Is Human, So Don't Be One.