An in-depth history of Large Language Models—and what their ubiquity, disruption, and creativity mean from a wider sociopolitical perspective.
In November 2022, ChatGPT swept the globe in a mixed frenzy of excitement and anxiety. Was this a step closer to the singularity or just another marvel in machine learning? Author Stephan Raaijmakers provides a comprehensive introduction to Large Language Models (LLMs), describing what exactly they are capable of from a technical and creative standpoint. This concise volume covers everything from the architecture of LLM neural networks to their limitations to how governments can regulate this technology. In explaining exactly how LLMs learn from data sets, Raaijmakers defangs the more sensational arguments we may be familiar with. Instead, he offers a more grounded account of how this groundbreaking, and increasingly ubiquitous, form of artificial intelligence will shape our society for years to come.
It's written for executives and reads as 18-24 months behind the state of the art, missing critical developments from the period it covers, such as RAG and inference-time scaling. It pushes constructor theory without mentioning current memory improvements. The forward-looking part is weak and states the obvious.
This does a very good job of introducing (and covering quite a bit of) the subject and, most importantly, of providing a wide-ranging discussion of perspectives on the limitations of these techniques.
It accomplishes this in a little over 200 pages of high-quality, accessible writing (which ups the probability that most people who start this book will read to the end).
There is an extensive set of references for following up ideas in more detail.
This is part of MIT Press's "Essential Knowledge" series of similarly excellent slim volumes on important scientific topics.