If you know how to program with Python and also know a little about probability, you’re ready to tackle Bayesian statistics. With this book, you'll learn how to solve statistical problems with Python code instead of mathematical notation, and use discrete probability distributions instead of continuous mathematics. Once you get the math out of the way, the Bayesian fundamentals will become clearer, and you’ll begin to apply these techniques to real-world problems. Bayesian statistical methods are becoming more common and more important, but not many resources are available to help beginners. Based on undergraduate classes taught by author Allen Downey, this book’s computational approach helps you get a solid start.
Allen Downey is a Professor Emeritus at Olin College and the author of a series of free textbooks related to software and data science, including Think Python, Think Bayes, and Think Complexity, which are also published by O’Reilly Media. His blog, Probably Overthinking It, features articles on Bayesian probability and statistics. He holds a Ph.D. in computer science from U.C. Berkeley, and M.S. and B.S. degrees from MIT.
Science has been described as simply “a collection of successful recipes”. In “Think Bayes” Allen B. Downey attempts just that, presenting a set of instructional tutorials that teach Bayesian methods with Python. In essence it’s an instructional book whose examples are meant to be straightforward, giving you a simple set of rules for solving more complex problems. The book also makes a deliberate style choice, setting aside continuous distributions to focus on discrete ones, which keeps the math more straightforward. Successful recipes need not be complex in every instance, which this book illustrates effectively.
An important caveat for this book is that it is supplemental material for teaching statistics. Instead of using mathematical notation like many other statistics books, it sticks to Python code for the most part, because its main goal is to be an applied-mathematics educational book. In a sense, this book is a customization of basic mathematical principles to meet the needs of programmers who wish to do statistics, not statisticians wanting to code. As someone who has struggled with the mathematical notation of statistics, I found this book a clear and dynamic guide into Bayesian statistics, starting off very well on page 8 with an astute treatment of the Monty Hall problem in Python code. The Monty Hall problem being the famous example of conditional probability that it is, Allen Downey uses it as a backbone of the book, returning to it with examples such as Cookie2.py, which he makes freely available on his website, as he does the book itself in PDF form.
In thinking through P(D|H), where D is the data that Monty opens a door revealing no car and H is the hypothesis that the car is behind a particular door, I was able to use insights from the book to deepen my understanding of the mathematical truth of why we should always “switch”. In a narrow window of three doors it’s much harder for us to see that, when Monty opens a door to show us a zonk, the car is ⅔ likely to be behind the remaining door. Widen that window to a thousand doors: we choose one, Monty opens nine hundred and ninety-eight of the others to show us nine hundred and ninety-eight zonks, and we can almost feel the weight of the probability funneling into that last unopened door, showing us that we should indeed switch.
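That thousand-door intuition is easy to check with a quick simulation; here is a minimal sketch of my own (not code from the book), where the helper name switch_wins is just an illustrative choice:

    import random

    def switch_wins(n_doors):
        """One round of Monty Hall in which the player always switches."""
        car, pick = random.randrange(n_doors), random.randrange(n_doors)
        # Monty opens every door except the pick and one other, never
        # revealing the car, so switching wins exactly when the pick was wrong.
        return pick != car

    trials = 100_000
    for n in (3, 1000):
        rate = sum(switch_wins(n) for _ in range(trials)) / trials
        print(f"{n} doors, always switch: win rate ~ {rate:.3f}")
    # Prints roughly 0.667 with 3 doors and 0.999 with 1000 doors.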
In chapter 4, the “Euro” problem is explored: a Belgian one-euro coin, spun on edge 250 times, came up heads 140 times and tails 110; do these data give evidence that the coin is biased? In this example, Downey introduces the concept of “swamping the priors”: with enough data, people who start with different priors will tend to converge on the same posterior. Even with substantially different priors, Downey shows that the posterior distributions are very similar: the medians and the credible intervals are identical, and the means differ by less than 0.5%. This was a highlight for me in particular because it showed that, with enough data, reasonable people converge.
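To see the swamping effect numerically, a small grid computation is enough. The 140/110 counts are from the book; the grid resolution and the triangle prior below are my own illustrative choices, not Downey’s code:

    import numpy as np

    xs = np.linspace(0, 1, 101)          # hypotheses: x = probability of heads
    uniform = np.ones(len(xs))           # flat prior
    triangle = np.minimum(xs, 1 - xs)    # a very different prior, peaked at 0.5

    def posterior(prior, heads=140, tails=110):
        like = xs**heads * (1 - xs)**tails   # binomial likelihood, up to a constant
        post = prior * like
        return post / post.sum()

    for name, prior in [("uniform", uniform), ("triangle", triangle)]:
        post = posterior(prior)
        print(name, "posterior mean:", round((xs * post).sum(), 4))
    # Both means land near 0.56; 250 spins swamp the difference in priors.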
In conclusion, Think Bayes creates opportunities for learning subject matter that enables you not only to know, but to use what you know in the varied contexts of statistics. It’s a divide-and-conquer strategy, which pairs well with Bayesian statistics. On the math side, to go back to the Monty Hall problem as Downey so often does himself: if P(H|D) is hard to deal with directly, but computing P(D|H)P(H)/P(D) is possible, then you have an algorithm on your hands. The examples in this book tend to break down after four to five dimensions, as they are meant as instruction in one- and two-dimensional problems. For that reason it’s a great introduction to Bayesian statistics; for more insight, it’s recommended to learn a Markov chain Monte Carlo approach and to use this book as a supplement on the way to those more complex concepts.
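In code, that algorithm is nothing more than prior times likelihood followed by normalization. Applied to the Monty Hall problem, a minimal sketch of my own (not code from the book) might look like:

    # Hypotheses: the car is behind door A (our pick), door B, or door C.
    prior = {"A": 1/3, "B": 1/3, "C": 1/3}

    # Data D: Monty opens door B and reveals a zonk.
    # Likelihood P(D|H) under each hypothesis:
    likelihood = {"A": 1/2,   # Monty could have opened B or C
                  "B": 0,     # Monty never reveals the car
                  "C": 1}     # Monty was forced to open B

    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnorm.values())               # this is P(D)
    posterior = {h: p / total for h, p in unnorm.items()}
    print(posterior)   # {'A': 0.33..., 'B': 0.0, 'C': 0.66...} -- switch to C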
A very good Bayesian introduction, especially because it's light on mathematics and full of practical content. I searched for this kind of content for a long time, and was surprised to find it in a book like this.
This book is great in terms of providing a wide range of examples and exercises through which we can understand more about how to "think Bayes". However, it still lacks detailed explanations, and the mixture of Python code and math does not make it easier to understand.
Good introductory book with interesting example problems. The example code layers abstractions on top of previously introduced ones from chapter to chapter. Over time the examples get hard to comprehend due to class-based polymorphism with multiple levels of inheritance. If not for this annoyance, it would be a great book.
Stats education is kind of a mess, or at least it was back when I was a grad student. Every branch of science needs statistics on a fundamental level, but exactly what gets filtered down to the next generation of students varies widely from field to field, each of which has its own lore, notation, and epistemological schisms. I got some formal education in various theorems and whatnot, but when presented with actual research data -- when I actually had to DO something with real data -- I mostly just learned the ropes from colleagues and advisors.
Anyway. Think Bayes is probably not the first or only book you need on Bayesian methods, but it does fill an important niche. It's not a heavy formal treatise with proofs and theorems and specialty distributions, but neither is it just someone throwing you the manual for R or Matlab. The book explains the basic method and intuition of Bayesian statistics (prior > likelihood > posterior > repeat) by building up models from scratch using the tools of modern Python (i.e. numpy / scipy / pandas etc). It starts simple and works its way up to useful modern tools like MCMC, and has enough variety that if you're faced with a new problem, there's probably a good, practical jumping-off point to be found here.
I accidentally read the first edition of this book rather than the updated second edition, which meant the code used a bespoke library written in a somewhat dated Python style. Nonetheless, the author presents an impressively lucid tour of Bayesian inference, never resorting to any formula more complicated than Bayes' theorem and keeping the implementations fairly straightforward. I particularly enjoyed the way Downey often starts with simple models, then notes their limitations and progressively adds complexity to them.
I would recommend that others seek out the second edition, which is free online, as it adds several new chapters and rewrites the code in more modern Python.
While the methodology behind the framework of the code examples wasn't always obvious (and seemed occasionally overwrought), I think the core statistical concepts come through clearly enough that they could be reimplemented in whatever fashion made most sense to the reader. Generally fairly concise, and generous with graphical outputs as well, which helped solidify conceptual aspects of distributions and their properties.
Allen Downey is a professor who has written several books about Python, statistics, and other topics. In Think Bayes he uses practical Python exercises to teach Bayesian statistics. The problem is that the Python code hides a lot of the detail, which can be quite confusing in the later chapters. This is a solid book with lots to learn from, but I wouldn't recommend it to beginners. Also, the author uses his own Python package, which can be quite problematic.
I like the applied ethos and the no-nonsense, fun example problems; the somewhat casual style is also refreshing. However, I think the author went overboard: using Python alone is redundant and simplistic, and a few mathematical expressions would not have hurt anyone.
If you have a basic understanding of Bayes, this book will help deepen your intuition. Take time to work the examples and problems (solutions are included) and circle back to the theory. It will help you bridge theory and practice.
Interesting examples and a good foundation of Bayesian statistics. Too much polymorphism and inheritance in the code, resulting from the use of highly abstracted classes, made it difficult to understand the methodology at times.
I just don't think it's an efficient introduction to Bayesian statistics. I don't think the book flows well, and I think each chapter could easily double its length to cover the material sufficiently for a beginner.
A good and practical introduction to Bayesian statistics using Python. While it won't really teach you how to think Bayes, it offers a number of practical examples with good discussion.
Interesting examples and a nice overview of Bayesian modelling. The undocumented Python code snippets and lack of mathematical rigour make it hard to use as a reference.
The second edition of this book is updated for Python 3, mostly applying the PyData stack (NumPy, pandas, Matplotlib, etc.) while not relying on higher abstractions such as PyMC3 or Pyro.
Good introduction to Bayesian analysis. I didn't take the time this time through to do all of the code samples and exercises, but I still got a decent overview. One of the best parts was the first really good explanation of the Monty Hall problem that I've seen; I finally understand it!
I'm not giving this up because I didn't find it interesting. I'm putting it on hold because there are some technical books that I need to read first (for work purposes).