This book offers a detailed and up-to-date introduction to machine learning (including deep learning) through the unifying lens of probabilistic modeling and Bayesian decision theory. The book covers mathematical background (including linear algebra and optimization), basic supervised learning (including linear and logistic regression and deep neural networks), as well as more advanced topics (including transfer learning and unsupervised learning). End-of-chapter exercises allow students to apply what they have learned, and an appendix covers notation.
Probabilistic Machine Learning grew out of the author's 2012 book, Machine Learning: A Probabilistic Perspective. More than just a simple update, this is a completely new book that reflects the dramatic developments in the field since 2012, most notably deep learning. In addition, the new book is accompanied by online Python code, using libraries such as scikit-learn, JAX, PyTorch, and TensorFlow, which can be used to reproduce nearly all the figures; this code can be run inside a web browser using cloud-based notebooks, and provides a practical complement to the theoretical topics discussed in the book. This introductory text will be followed by a sequel that covers more advanced topics, taking the same probabilistic approach.
Kevin P. Murphy is a Research Scientist at Google. Previously, he was Associate Professor of Computer Science and Statistics at the University of British Columbia.
It would be fair to say that if you understand most of the material in this volume, you are almost at the level of a graduate student in the field of machine learning, and reading current arXiv or JMLR papers should be viable for you. Having said that, I must also add that this volume, at least in some chapters, is more wide than deep, in the sense that there are entire textbooks dedicated to topics to which this book devotes only a paragraph or a few pages. I find this understandable; otherwise the author would have needed to write and publish at least four volumes instead of two.
Writing such a comprehensive book in the field of machine learning and artificial intelligence (a field undergoing a massive boom) is no easy task, and needless to say, the author deserves all of the praise. But I must add that having different authors write some of the chapters, no matter how good they are in their respective sub-fields, leads to an unbalanced 'voice', if I may say so. You can see that parts of the book are rushed, not every chapter goes into adequate detail, some of the chapters do not contain any exercises, etc. On the other hand, some chapters include a balanced set of exercises that stretch your understanding and make connections to relevant fields; for example, I liked the exercise about the newsvendor problem, among others. Nevertheless, I'm sure the author could have done an even better pedagogical job had he spent more time dealing with forward references, redundancies, etc.
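For readers curious about the newsvendor exercise mentioned above: the classic result is that the optimal stock level is the demand quantile at the "critical fractile". A minimal sketch, assuming zero salvage value and normally distributed demand (the function name and parameter values here are my own illustration, not the book's exercise):

```python
from statistics import NormalDist

def newsvendor_quantity(price, cost, demand_mean, demand_std):
    """Optimal stock level for the classic newsvendor problem.

    Each unit of unmet demand forgoes profit (price - cost); each
    unsold stocked unit loses its cost (assuming zero salvage value).
    The expected-profit-maximizing quantity is the demand quantile at
    the critical fractile (price - cost) / price, computed here under
    a Normal demand assumption.
    """
    critical_fractile = (price - cost) / price
    return NormalDist(demand_mean, demand_std).inv_cdf(critical_fractile)

# Example: sell at 5, buy at 2, demand ~ Normal(100, 15).
# Critical fractile = 0.6, so stock slightly above mean demand.
q = newsvendor_quantity(price=5.0, cost=2.0, demand_mean=100.0, demand_std=15.0)
print(round(q, 1))
```

Note that when the margin is thin (price close to cost) the fractile drops below 0.5 and the optimal order falls below mean demand, which is the kind of connection between decision theory and quantile estimation the exercise is getting at.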
There's also an accidental Easter egg of sorts, where the author writes that a certain figure is similar to the figures on the book cover, but he must have either copy-pasted that from the previous edition of ten years ago, or been referring to the second volume. ;)
Also, my complaint to MIT Press: you should have done better editing and proofreading; the author and the book deserve that much.
Do I recommend this book? Well, if you want a realistic picture of what it'll take to be a researcher in modern machine learning, then you can't go wrong with this volume. On top of that, I definitely want to get my hands on the recently published second volume, because, like a movie trailer, this first volume is full of forward references to it, creating suspense and excitement!
This is the 2nd book of the Probabilistic Machine Learning series and covers more advanced, state-of-the-art topics. However, this is not a book for everyone. Here are some of my impressions after reading the whole series.
1. Who is the series written for? I am not sure who the ideal target readers are, but I am clear about who would suffer (like myself):
1.1 Readers without solid linear algebra, calculus, probability, and statistical inference. (Though the book does include a few chapters on mathematical foundations, I would not recommend readers learn all of this from scratch there.)
1.2 Readers without intermediate knowledge of, or hands-on experience in, ML. (A reader who has never run a simple linear regression, neural network, or tree model before would not understand 90% of the material.)
1.3 Readers without a broad view across statistics (inference, Bayesian methods, time series, causal inference) and basic ML/DL. You don't have to be an expert in all areas (few can be), but you are expected to know the basic concepts of most of them.
I strongly recommend readers start with introductory texts such as ESL and other deep learning books/tutorials first. This is not for beginners; instead, it's for ML/DL veterans.
2. How to best use the series? I view the series as a user guide for ML practitioners. It covers almost all topics in the ML/DL area (prediction, inference, generation, discovery), and the material is up to date (as of early 2024). The best way to use the book is:
a. You are trying to solve a particular problem.
b. You guess it may be related to XXX and want to learn more details, or its connections to other approaches.
c. Look up XXX in the book; it will give you a relatively full picture of XXX, with references and pointers in the right directions.
d. Check out the references, or google the topics you are interested in, and go deep into the area yourself.
So in general, the book serves as a quick introduction to a particular topic; its greatest value is showing you the connections between that topic and others, and letting you learn the basics in 1-2 hours.
3. What to expect from reading the series? The series is designed for repeated reading, and the best way to read it is with a mission. For example, suppose I just want to learn what a VAE is and how it works. You will be happy to solve that particular problem and also gain a view of how it fits into the broader landscape of ML. Then check out other resources (GitHub/blogs) to code your own solutions from there.
A good survey of ML topics. In many sections the term 'survey' is apt, since the book provides a one-paragraph overview of the relevant works. I will probably go through the sequel as well.
If you want to get your hands dirty, this is perhaps not for you. It does not cover coding libraries, and the exercises are limited to proofs. It is much more a theoretical work and a reference guide for researchers.