I have a real problem with this book.
On the one hand, it does its primary job well. It sets out to be a quick and easy overview of the entire field of data compression, so that readers come away with some idea of the structure of a large field and of which ideas may or may not be relevant to their specific needs and interests.
If this were the entirety of the book, I'd give it four, maybe five stars.
On the other hand...
The book tries to adopt a jokey, humorous manner throughout, and while I have no problem with that in principle, there is a real problem when the jokes are riddled with serious misinformation.
Let me give just two examples.
On the less serious side, there is a "joke" that mocks QuickTime as providing technology so stupid, har har, that all it could support was MJPEG files until YouTube was launched in 2001 and showed us all how to do things properly. Well, apart from the fact that YouTube was actually launched in 2005, and that QuickTime supported streaming (both live and stored content) in 1999, this "joke" insults the entire QuickTime team (of which I was a member) by implying that we were all yokels who had no idea what we were doing. In fact QuickTime supported a variety of codecs from day one, and those codecs used, in various combinations, pretty much all of the techniques described in the book.
On the more serious side, the authors make a horrifying mistake by claiming that entropy is a property of a set, rather than of a random variable or a stochastic process. This sets them up for an ongoing "joke" that those mathematicians with their entropy measure are so stupid that they can't even see how such obvious transforms as dictionary mechanisms, deltas, or RLE can be used to improve compression. Needless to say, this is complete nonsense. Even Shannon's original paper makes clear that the concept applies to stochastic PROCESSES, where the correlations between successive random variables have important consequences for the definition of entropy and thus for compression. These correlations, and examples of how they play out, are among the most frequently quoted features of the paper, and they appear early on, starting at page 5, with obvious examples on pages 6 and 7.
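To make the distinction concrete, here is a minimal sketch of my own (plain Python, nothing taken from the book). It computes the zeroth-order "symbol frequency" entropy, i.e. the entropy-of-a-set view the book takes, for a ramp signal and for its delta transform. Treated as a bag of values, the ramp looks incompressible; treated as a process whose successive samples are correlated, a one-line delta transform exposes that it is almost perfectly predictable.

```python
# Sketch: zeroth-order entropy vs. a simple process-aware transform.
from collections import Counter
from math import log2

def zeroth_order_entropy(symbols):
    """Entropy (bits/symbol) of the empirical symbol distribution,
    i.e. the 'entropy of the set of values' view criticized above."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# A ramp signal: highly structured as a process, but with uniform symbol counts.
ramp = list(range(256)) * 4

# Delta transform: byte-wrapped differences between successive samples.
deltas = [ramp[0]] + [(b - a) % 256 for a, b in zip(ramp, ramp[1:])]

print(zeroth_order_entropy(ramp))    # 8.0 bits/symbol: looks "incompressible"
print(zeroth_order_entropy(deltas))  # ~0.01 bits/symbol: nearly every delta is 1
```

Any entropy coder run after the delta step captures essentially all of that gain, which is exactly the interplay between modeling the process and coding it that the entropy-of-a-set framing obscures.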
This matters. The greatest weakness in our current STEM education (IMHO) is the near invisibility of measure-theoretic probability, which provides, in its language of measures, sigma-algebras, random variables, and stochastic processes, a remarkably powerful set of ideas for thinking about the world, every bit as valuable as sequence, derivative, and integral. But instead of being taught these ideas, most students (even physicists, CS students, and engineers) are taught a kind of "folk probability" that supplies a few useful formulas but is essentially mired in exactly the sort of confusion this book so obviously displays, with no understanding of the difference between a random variable, a set of instantiations of a random variable, a random process, and its instantiations.
I hope the next version of this book tosses the "jokes" and instead includes a chapter (which doesn't have to be scary or complicated) simply laying out the basic foundational concepts of measure-theoretic probability, at the same level of informality, and with the same goal of providing a useful overview, as the treatment of the actual compression algorithms.