Allen B. Downey's Blog: Probably Overthinking It

September 25, 2025

The Poincaré Problem

Selection bias is the hardest problem in statistics because it’s almost unavoidable in practice, and once the data have been collected, it’s usually not possible to quantify the effect of selection or recover an unbiased estimate of what you are trying to measure.

And because the effect is systematic, not random, it doesn’t help to collect more data. In fact, larger sample sizes make the problem worse, because they give the false impression of precision.

But sometimes, if we are willing to make assumptions about the data generating process, we can use Bayesian methods to infer the effect of selection bias and produce an unbiased estimate.

Click here to run this notebook on Colab.

Poincaré and the Baker

As an example, let’s solve an exercise from Chapter 7 of Think Bayes. It’s based on a fictional anecdote about the mathematician Henri Poincaré:


Supposedly Poincaré suspected that his local bakery was selling loaves of bread that were lighter than the advertised weight of 1 kg, so every day for a year he bought a loaf of bread, brought it home and weighed it. At the end of the year, he plotted the distribution of his measurements and showed that it fit a normal distribution with mean 950 g and standard deviation 50 g. He brought this evidence to the bread police, who gave the baker a warning.


For the next year, Poincaré continued to weigh his bread every day. At the end of the year, he found that the average weight was 1000 g, just as it should be, but again he complained to the bread police, and this time they fined the baker.


Why? Because the shape of the new distribution was asymmetric. Unlike the normal distribution, it was skewed to the right, which is consistent with the hypothesis that the baker was still making 950 g loaves, but deliberately giving Poincaré the heavier ones.


To see whether this anecdote is plausible, let’s suppose that when the baker sees Poincaré coming, he hefts k loaves of bread and gives Poincaré the heaviest one. How many loaves would the baker have to heft to make the average of the maximum 1000 g?


How Many Loaves?

Here are distributions with the same underlying normal distribution and different values of k.

mu_true, sigma_true = 950, 50

As k increases, the mean increases and the standard deviation decreases.

When k=4, the mean is close to 1000. So let’s assume the baker hefted four loaves and gave the heaviest to Poincaré.
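As a sanity check, here is a minimal simulation sketch (my addition, not code from the post): it draws many days of loaves from the underlying Normal(950, 50) distribution, keeps the heaviest of k each day, and reports the mean and standard deviation of the maximum.

import numpy as np

mu_true, sigma_true = 950, 50
rng = np.random.default_rng(17)

for k in [1, 2, 3, 4, 5]:
    # each row is one day: the baker hefts k loaves and keeps the heaviest
    loaves = rng.normal(mu_true, sigma_true, size=(100_000, k))
    heaviest = loaves.max(axis=1)
    print(k, heaviest.mean().round(1), heaviest.std().round(1))

With k=4, the simulated mean of the maximum comes out near 1000 g, consistent with the claim above.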

At the end of one year, can we tell the difference between the following possibilities?

Innocent: The baker actually increased the mean to 1000, and k=1.
Shenanigans: The mean was still 950, but the baker selected with k=4.

Here’s a sample under the k=4 scenario, compared to 10 samples with the same mean and standard deviation, and k=1.

The k=4 distribution falls mostly within the range of variation we’d expect from the k=1 distribution (with the same mean and standard deviation). If you were on the jury and saw this evidence, would you convict the baker?

Ask a Bayesian

To take a Bayesian approach to this problem, let’s see whether we can use this data to estimate k and the parameters of the underlying distribution. Here’s a PyMC model that

Defines prior distributions for mu, sigma, and k, and
Uses a custom distribution that computes the likelihood of the data for a hypothetical set of parameters (see the notebook for details).

def make_model(sample):
    with pm.Model() as model:
        mu = pm.Normal("mu", mu=950, sigma=30)
        sigma = pm.HalfNormal("sigma", sigma=30)
        k = pm.Uniform("k", lower=0.5, upper=15)
        obs = pm.CustomDist(
            "obs",
            mu,
            sigma,
            k,
            logp=max_normal_logp,
            observed=sample,
        )
    return model
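The helper max_normal_logp is defined in the accompanying notebook. As a rough sketch of the idea (the function body below is my reconstruction, not necessarily the notebook's code): the density of the maximum of k iid normal values is k times the normal CDF raised to the power k-1, times the normal PDF, so the log-density is log k plus (k-1) times the log-CDF plus the log-PDF.

import pymc as pm
import pytensor.tensor as pt

def max_normal_logp(value, mu, sigma, k):
    # log-density of the maximum of k iid Normal(mu, sigma) draws:
    # f(x) = k * Phi(z)**(k-1) * phi(z) / sigma  with  z = (x - mu) / sigma
    dist = pm.Normal.dist(mu=mu, sigma=sigma)
    return pt.log(k) + (k - 1) * pm.logcdf(dist, value) + pm.logp(dist, value)

Because the exponent k-1 does not have to be an integer, this expression is well defined for non-integer values of k.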

Notice that we treat k as continuous. That’s because continuous parameters are much easier to sample (and the log PDF function allows non-integer values of k). But it also makes sense in the context of the problem – for example, if the baker sometimes hefts three loaves and sometimes four, we can approximate the distribution of the maximum with k=3.5.

The model runs quickly and the diagnostics look good. Here are the posterior distributions of the parameters compared to their known values.

Posterior distribution of mu: the posterior mean is 940, compared to the true value 950.
Posterior distribution of sigma: the posterior mean is 54, compared to the true value 50.
Posterior distribution of k: the posterior mean is 5.5, compared to the true value 4.

With one year of data, we can recover the parameters pretty well. The true values fall comfortably inside the posterior distributions, and the posterior mode of k is close to the true value, 4.

But the posterior distributions are still quite wide. There is even some possibility that the baker is innocent, although it is small.

Conclusion

This example shows that we can use the shape of an observed distribution to estimate the effect of selection bias and recover the unbiased latent distribution. But we might need a lot of data, and the inference depends on strong assumptions about the data generating process.

Credits: I don’t remember where I got this example from (maybe here?), but it appears in Leonard Mlodinow, The Drunkard’s Walk (2008). Mlodinow credits Bart Holland, What Are the Chances? (2002). The ultimate source seems to be George Gamow and Marvin Stern, Puzzle Math (1958) – but their version is about a German professor, not Poincaré.

You can order print and ebook versions of Think Bayes 2e from Bookshop.org and Amazon.


September 22, 2025

Think Linear Algebra

I have published the first five chapters of Think Linear Algebra! You can read them here or follow these links to run the notebooks on Colab. Here are the chapters I have so far:

Chapter 1: The Power of Linear Algebra
Introduces matrix multiplication and eigenvectors through a network-based model of museum traffic, and implements the PageRank algorithm for quantifying the quality of web pages.

Chapter 5: To Boldly Go
Uses matrices to scale, rotate, shear, and translate vectors. Applies these methods to 2D computer graphics, including a reimplementation of the classic video game Asteroids.

Chapter 7: Systems of Equations
Applies LU decomposition and matrix equations to analyze electrical circuits. Shows how linear algebra solves real engineering problems.

Chapter 8: Null Space
Investigates chemical stoichiometry as a system with multiple valid solutions. Introduces concepts of rank and nullspace to describe the solution space.

Chapter 9: Truss the System
Models structural systems where the unknowns are vector forces. Uses block matrices and rank analysis to compute internal stresses in trusses.

As you can tell by the chapter numbers, there is more to come — although the sequence of topics might change.

If you are curious about this project, here’s more about why I’m writing this book.

Math is not real

In a previous article, I wrote about “math supremacy”, which is the idea that math notation is the real thing, and everything else — including and especially code — is an inferior imitation.

I am confronted with math supremacy more often than most people, because I write books that use code to present ideas that are usually expressed in math notation. I think code can be simpler and clearer, but not everyone agrees, and some of them disagree loudly.

With Think Linear Algebra, I am taking my “code first” approach deep into the domain of math supremacy. Today I was using block matrices to analyze a truss, an example I remember seeing in my college linear algebra class. I remember that I did not find the example particularly compelling, because after setting up the problem — and it takes a lot of setting up — we never really finished it. That is, we talked about how to analyze a truss, hypothetically, but we never actually did it.

This is a fundamental problem with the way math is taught in engineering and the sciences. We send students off to the math department to take calculus and linear algebra, we hope they will be able to apply it to classes in their major, and we are disappointed — and endlessly surprised — when they can’t.

Part of the problem is that transfer of learning is much harder than many people realize, and does not happen automatically, as many teachers expect.

Another part of the problem is what I wrote about in Modeling and Simulation in Python: a complete modeling process involves abstraction, analysis, and validation. In most classes we only teach analysis, neglecting the other steps, and in some math classes we don’t even do that — we set up the tools to do analysis and never actually do it.

This is the power of the computational approach — we can demonstrate all of the steps, and actually solve the problem. So I find it ironic when people dismiss computation and ask for the “math behind it”, as if theory is reality and reality is a pale imitation. Math is a powerful tool for analysis, but to solve real problems, it is not the only tool we need. And it is not, contrary to Plato, more real than reality.


May 28, 2025

Announcing Think Linear Algebra

I’ve been thinking about Think Linear Algebra for more than a decade, and recently I started working on it in earnest. If you want to get a sense of it, I’ve posted a draft chapter as a Jupyter notebook.

In one way, I am glad I waited — I think it will be better, faster [to write], and stronger [?] because of AI tools. To be clear, I am writing this book, not AI. But I’m finding ChatGPT helpful for brainstorming and Copilot and Cursor helpful for generating and testing code.

If you are curious, here’s my discussion with ChatGPT about that sample chapter. Before you read it, I want to say in my defense that I often ask questions where I think I know the answer, as a way of checking my understanding without leading too strongly. That way I avoid one of the more painful anti-patterns of working with AI tools, the spiral of confusion that can happen if you start from an incorrect premise.

My next step is to write a proposal, and I will probably use AI tools for that, too. Here’s a first draft that outlines the features I have in mind:

1. Case-Based, Code-First

Each chapter is built around a case study—drawn from engineering, physics, signal processing, or beyond—that demonstrates the power of linear algebra methods. These examples unfold in Jupyter notebooks that combine explanation, Python code, visualizations, and exercises, all in one place.

2. Multiple Computational Perspectives

The book uses a variety of tools—NumPy for efficient arrays, SciPy for numerical methods, SymPy for symbolic manipulation, and even NetworkX for graph-based systems. Readers see how different libraries offer different lenses on the same mathematical ideas—and how choosing the right one can make thinking and doing more effective.

3. Top-Down Learning

Rather than starting from scratch with low-level implementations, we use robust, well-tested libraries from day one. That way, readers can solve real problems immediately, and explore how the algorithms work only when it’s useful to do so. This approach makes linear algebra more motivating, more intuitive—and more fun.

4. Linear Algebra as a Language for Thought

Vectors and matrices are more than data structures—they’re conceptual tools. By expressing problems in linear algebra terms, readers learn to think in higher-level chunks and unlock general-purpose solutions. Instead of custom code for each new problem, they learn to use elegant, efficient abstractions. As I wrote in Programming as a Way of Thinking, modern programming lets us collapse the gap between expressing, exploring, and executing ideas.

Finally, here’s what ChatGPT thinks the cover should look like:


May 22, 2025

My very busy week

I’m not sure who scheduled ODSC and PyConUS during the same week, but I am unhappy with their decisions. Last Tuesday I presented a talk and co-presented a workshop at ODSC, and on Thursday I presented a tutorial at PyCon.

If you would like to follow along with my very busy week, here are the resources:

Practical Bayesian Modeling with PyMC

Co-presented with Alex Fengler for ODSC East 2025

In this tutorial, we explore Bayesian regression using PyMC – the primary library for Bayesian sampling in Python – focusing on survey data and other datasets with categorical outcomes. Starting with logistic regression, we’ll build up to categorical and ordered logistic regression, showcasing how Bayesian approaches provide versatile tools for developing and evaluating complex models. Participants will leave with practical skills for implementing Bayesian regression models in PyMC, along with a deeper appreciation for the power of Bayesian inference in real-world data analysis. Participants should be familiar with Python, the SciPy ecosystem, and basic statistics, but no experience with Bayesian methods is required.

The repository for this tutorial is here; it includes notebooks where you can run the examples, and there’s a link to the slides.

And then later that day I presented…

Mastering Time Series Analysis with StatsModels: From Decomposition to ARIMA

Time series analysis provides essential tools for modeling and predicting time-dependent data, especially data exhibiting seasonal patterns or serial correlation. This tutorial covers tools in the StatsModels library including seasonal decomposition and ARIMA. As examples, we’ll look at weather data and electricity generation from renewable sources in the United States since 2004 — but the methods we’ll cover apply to many kinds of real-world time series data.

Outline:
Introduction to time series
Overview of the data
Seasonal decomposition, additive model
Seasonal decomposition, multiplicative model
Serial correlation and autoregression
ARIMA
Seasonal ARIMA

This talk is based on Chapter 12 of the new edition of Think Stats. Here are the slides.

Unfortunately there’s no video from the talk, but I presented related material in this workshop for PyData Global 2024.

After the talk, Seamus McGovern presented me with an award for being, apparently, the most frequent ODSC speaker!

On Wednesday I flew to Pittsburgh, and on Thursday I presented…

Analyzing Survey Data with Pandas and StatsModels

PyConUS 2025 tutorial

Whether you are working with customer data or tracking election polls, Pandas and StatsModels provide powerful tools for getting insights from survey data. In this tutorial, we’ll start with the basics and work up to age-period-cohort analysis and logistic regression. As examples, we’ll use data from the General Social Survey to see how political beliefs have changed over the last 50 years in the United States. We’ll follow the essential steps of a data science project: loading and validating data, exploring and visualizing, modeling and predicting, and communicating results.

Here’s the repository with the notebooks and a link to the slides.

Sadly, the tutorial was not recorded.

Now that I have a moment of calm, I’m getting back to Think Linear Algebra. More about that soon!


April 6, 2025

Announcing Think Stats 3e

The third edition of Think Stats is on its way to the printer! You can preorder now from Bookshop.org and Amazon (those are affiliate links), or if you can’t wait to get a paper copy, you can read the free, online version here.

Here’s the new cover, still featuring a suspicious-looking archerfish.

If you are not familiar with the previous editions, Think Stats is an introduction to practical methods for exploring and visualizing data, discovering relationships and trends, and communicating results.
The organization of the book follows the process I use when I start working with a dataset:

Importing and cleaning: Whatever format the data is in, it usually takes some time and effort to read the data, clean and transform it, and check that everything made it through the translation process intact.
Single variable explorations: I usually start by examining one variable at a time, finding out what the variables mean, looking at distributions of the values, and choosing appropriate summary statistics.
Pair-wise explorations: To identify possible relationships between variables, I look at tables and scatter plots, and compute correlations and linear fits.
Multivariate analysis: If there are apparent relationships between variables, I use multiple regression to add control variables and investigate more complex relationships.
Estimation and hypothesis testing: When reporting statistical results, it is important to answer three questions: How big is the effect? How much variability should we expect if we run the same measurement again? Is it plausible that the apparent effect is due to chance?
Visualization: During exploration, visualization is an important tool for finding possible relationships and effects. Then if an apparent effect holds up to scrutiny, visualization is an effective way to communicate results.

What’s new?

For the third edition, I started by moving the book into Jupyter notebooks. This change has one immediate benefit — you can read the text, run the code, and work on the exercises all in one place. And the notebooks are designed to work on Google Colab, so you can get started without installing anything.

The move to notebooks has another benefit — the code is more visible. In the first two editions, some of the code was in the book and some was in supporting files available online. In retrospect, it’s clear that splitting the material in this way was not ideal, and it made the code more complicated than it needed to be. In the third edition, I was able to simplify the code and make it more readable.

Since the last edition was published, I’ve developed a library called empiricaldist that provides objects that represent statistical distributions. This library is more mature now, so the updated code makes better use of it.

When I started this project, NumPy and SciPy were not as widely used, and Pandas even less, so the original code used Python data structures like lists and dictionaries. This edition uses arrays and Pandas structures extensively, and makes more use of functions these libraries provide.

The third edition covers the same topics as the original, in almost the same order, but the text is substantially revised. Some of the examples are new; others are updated with new data. I’ve developed new exercises, revised some of the old ones, and removed a few. I think the updated exercises are better connected to the examples, and more interesting.

Since the first edition, this book has been based on the thesis that many ideas that are hard to explain with math are easier to explain with code. In this edition, I have doubled down on this idea, to the point where there is almost no mathematical notation left.

New Data, New Examples

In the previous edition, I was not happy with the chapter on time-series analysis, so I almost entirely replaced it, using as an example data on renewable electricity generation from the U.S. Energy Information Administration. This dataset is more interesting than the one it replaced, and it works better with time-series methods, including seasonal decomposition and ARIMA.

Example from Chapter 12, showing electricity production from solar power in the US from 2014 to 2024.

And for the chapters on regression (simple and multiple) I couldn’t resist using the now-famous Palmer penguin dataset.

Example from Chapter 10, showing a scatter plot of penguin measurements.

Other examples use some of the same datasets from the previous edition, including the National Survey of Family Growth (NSFG) and Behavioral Risk Factor Surveillance System (BRFSS).

Overall, I’m very happy with the results. I hope you like it!


March 19, 2025

Young Adults Want Fewer Children

The most recent data from the National Survey of Family Growth (NSFG) provides a first look at people born in the 2000s as young adults and an updated view of people born in the 1990s at the peak of their child-bearing years. Compared to previous generations at the same ages, these cohorts have fewer children, and they are less likely to say they intend to have children. Unless their plans change, trends toward lower fertility are likely to continue for the next 10-20 years.

The following figure shows the number of children fathered by male respondents as a function of their age when interviewed, grouped by decade of birth. It includes the most recent data, collected in 2022-23, combined with data from previous iterations of the survey going back to 1982.

Men born in the 1990s and 2000s have fathered fewer children than previous generations at the same ages:

At age 33, men born in the 1990s (blue line) have 0.6 children on average, compared to 1.1 – 1.4 in previous cohorts. At age 24, men born in the 2000s (violet line) have 0.1 children on average, compared to 0.2 – 0.4 in previous cohorts.

The pattern is similar for women.

Women born in the 1990s and 2000s are having fewer children, later, than previous generations. 

At age 33, women in the 1990s cohort have 1.4 children on average, compared to 1.7 – 1.8 in previous cohorts. At age 24, women in the 2000s cohort have 0.3 children on average, compared to 0.6 – 0.8 in previous cohorts.

Desires and Intentions

The NSFG asks respondents whether they want to have children and whether they intend to. These questions are useful because they distinguish between two possible causes of declining fertility. If someone says they want a child, but don’t intend to have one, it seems like something is standing in their way. In that case, changing circumstances might change their intentions. But if they don’t want children, that might be less likely to change.

Let’s start with stated desires. The following figure shows the fraction of men who say they want a child — or another child if they have at least one — grouped by decade of birth.

Men born in the 2000s are less likely to say they want to have a child — about 86% compared to 92% in previous cohorts. Men born in the 1990s are indistinguishable from previous cohorts.

The pattern is similar for women — the following figure shows the fraction who say they want a baby, grouped by decade of birth.

Women born in the 2000s are less likely to say they want a baby — about 76%, compared to 87% for previous cohorts when they were interviewed at the same ages. Women born in the 1990s are in line with previous generations. 

Maybe surprisingly, men are more likely than women to say they want children. For example, of young men (15 to 24) born in the 2000s, 86% say they want children, compared to 76% of their female peers. Lyman Stone wrote about this pattern recently.

What About Intentions?

The patterns are similar when people are asked whether they intend to have a child. Men and women born in the 1990s are indistinguishable from previous generations, but

Men born in the 2000s are less likely to say they intend to have a child — about 80% compared to 85–86% in previous cohorts at the same ages (15 to 24).
Women born in the 2000s are less likely to say they intend to have a child — about 69% compared to 80–82% in previous cohorts.

Now let’s look more closely at the difference between wants and intentions. The following figure shows the percentage of men who want a child minus the percentage who intend to have a child.

Among young men, the difference is small — most people who want a child intend to have one. The difference increases with age. Among men in their 30s, a substantial number say they would like another child but don’t intend to have one.

Here are the same differences for women.

The patterns are similar — among young women, most who want a child intend to have one. Among women in their 30s, the gap sometimes exceeds 20 percentage points, but might be decreasing in successive generations.

These results suggest that fertility is lower among people born in the 1990s and 2000s — at least so far — because they want fewer children, not because circumstances prevent them from having the children they want.

From the point of view of reproductive freedom, that conclusion is better than an alternative where people want children but can’t have them. But from the perspective of public policy, these results suggest that reversing these trends would be difficult: removing barriers is relatively easy — changing what people want is generally harder.

DATA NOTE: In the most recent iteration of the NSFG, about 75% of respondents were surveyed online; the other 25% were interviewed face-to-face, as all respondents were in previous iterations. Changes like this can affect the results, especially for more sensitive questions. And in the NSFG, Lyman Stone has pointed out that there are non-negligible differences when we compare online and face-to-face responses. Specifically, people who responded online were less likely to say they want children and less likely to say they intend to have children. At first consideration, it’s possible that these differences could be due to social desirability bias.

However, people who responded online also reported substantially lower parity (women) and number of biological children (men), on average, than people interviewed face-to-face — and it is much less likely that these responses depend on interview format. It is more likely that the way respondents were assigned to different formats depended on parity/number of children, and that difference explains the observed differences in desire and intent for more children. Since there is no strong evidence that the change in format accounts for the differences we see, I’m taking the results at face value for now.



January 20, 2025

Algorithmic Fairness

This is the last in a series of excerpts from Elements of Data Science, now available from Lulu.com and online booksellers.

This article is based on the Recidivism Case Study, which is about algorithmic fairness. The goal of the case study is to explain the statistical arguments presented in two articles from 2016:

“Machine Bias”, by Julia Angwin, Jeff Larson, Surya Mattu and Lauren Kirchner, published by ProPublica.
A response by Sam Corbett-Davies, Emma Pierson, Avi Feller and Sharad Goel: “A computer program used for bail and sentencing decisions was labeled biased against blacks. It’s actually not that clear.”, published in the Washington Post.

Both are about COMPAS, a statistical tool used in the justice system to assign defendants a “risk score” that is intended to reflect the risk that they will commit another crime if released.

The ProPublica article evaluates COMPAS as a binary classifier, and compares its error rates for black and white defendants. In response, the Washington Post article shows that COMPAS has the same predictive value for black and white defendants. And they explain that the test cannot have the same predictive value and the same error rates at the same time.

In the first notebook I replicated the analysis from the ProPublica article. In the second notebook I replicated the analysis from the WaPo article. In this article I use the same methods to evaluate the performance of COMPAS for male and female defendants. I find that COMPAS is unfair to women: at every level of predicted risk, women are less likely to be arrested for another crime.

You can run this Jupyter notebook on Colab.

Male and female defendants

The authors of the ProPublica article published a supplementary article, How We Analyzed the COMPAS Recidivism Algorithm, which describes their analysis in more detail. In the supplementary article, they briefly mention results for male and female respondents:

The COMPAS system unevenly predicts recidivism between genders. According to Kaplan-Meier estimates, women rated high risk recidivated at a 47.5 percent rate during two years after they were scored. But men rated high risk recidivated at a much higher rate – 61.2 percent – over the same time period. This means that a high-risk woman has a much lower risk of recidivating than a high-risk man, a fact that may be overlooked by law enforcement officials interpreting the score.

We can replicate this result using the methods from the previous notebooks; we don’t have to do Kaplan-Meier estimation.

According to the binary gender classification in this dataset, about 81% of defendants are male.

male = cp["sex"] == "Male"male.mean()0.8066260049902967female = cp["sex"] == "Female"female.mean()0.19337399500970334

Here are the confusion matrices for male and female defendants.

from rcs_utils import make_matrix

matrix_male = make_matrix(cp[male])
matrix_male

          Pred Positive  Pred Negative
Actual
Positive           1732           1021
Negative            994           2072

matrix_female = make_matrix(cp[female])
matrix_female

          Pred Positive  Pred Negative
Actual
Positive            303            195
Negative            288            609

And here are the metrics:

from rcs_utils import compute_metrics

metrics_male = compute_metrics(matrix_male, "Male defendants")
metrics_male

                 Percent
Male defendants
FPR                 32.4
FNR                 37.1
PPV                 63.5
NPV                 67.0
Prevalence          47.3

metrics_female = compute_metrics(matrix_female, "Female defendants")
metrics_female

                   Percent
Female defendants
FPR                   32.1
FNR                   39.2
PPV                   51.3
NPV                   75.7
Prevalence            35.7

The fraction of defendants charged with another crime (prevalence) is substantially higher for male defendants (47% vs 36%).

Nevertheless, the error rates for the two groups are about the same. As a result, the predictive values for the two groups are substantially different:

PPV: Women classified as high risk are less likely to be charged with another crime, compared to high-risk men (51% vs 64%).
NPV: Women classified as low risk are more likely to “survive” two years without a new charge, compared to low-risk men (76% vs 67%).
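The connection between error rates, prevalence, and predictive value is just arithmetic: PPV = TPR * prevalence / (TPR * prevalence + FPR * (1 - prevalence)), where TPR = 1 - FNR. Here is a quick check (my addition) using the numbers above, showing that with nearly equal error rates, the difference in prevalence alone accounts for the gap in PPV.

def ppv(fpr, fnr, prevalence):
    # positive predictive value implied by the error rates and prevalence
    tpr = 1 - fnr
    return tpr * prevalence / (tpr * prevalence + fpr * (1 - prevalence))

print(ppv(0.324, 0.371, 0.473))  # male defendants: about 0.63
print(ppv(0.321, 0.392, 0.357))  # female defendants: about 0.51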

The difference in predictive values implies that COMPAS is not calibrated for men and women. Here are the calibration curves for male and female defendants.


For all risk scores, female defendants are substantially less likely to be charged with another crime. Or, reading the graph the other way, female defendants are given risk scores 1-2 points higher than male defendants with the same actual risk of recidivism.

To the degree that COMPAS scores are used to decide which defendants are incarcerated, those decisions:

Are unfair to women.
Are less effective than they could be, if they incarcerate lower-risk women while allowing higher-risk men to go free.

What would it take?

Suppose we want to fix COMPAS so that predictive values are the same for male and female defendants. We could do that by using different thresholds for the two groups. In this section, we’ll see what it would take to re-calibrate COMPAS; then we’ll find out what effect that would have on error rates.

From the previous notebook, sweep_threshold loops through possible thresholds, makes the confusion matrix for each threshold, and computes the accuracy metrics. Here are the resulting tables for all defendants, male defendants, and female defendants.

from rcs_utils import sweep_threshold

table_all = sweep_threshold(cp)
table_male = sweep_threshold(cp[male])
table_female = sweep_threshold(cp[female])
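For reference, here is a rough sketch of what a threshold sweep does. This is my approximation, not the rcs_utils implementation, and the column names decile_score and two_year_recid are assumptions about how the COMPAS data is labeled.

import numpy as np
import pandas as pd

def sweep_threshold_sketch(df, score="decile_score", outcome="two_year_recid"):
    """For each threshold, classify scores at or above it as high risk and
    record the error rates and predictive values, as percentages."""
    rows = {}
    for threshold in np.arange(1.5, 10.5, 0.5):
        pred = df[score] >= threshold
        actual = df[outcome] == 1
        tp = (pred & actual).sum()
        fp = (pred & ~actual).sum()
        fn = (~pred & actual).sum()
        tn = (~pred & ~actual).sum()
        rows[threshold] = {
            "FPR": 100 * fp / (fp + tn),
            "FNR": 100 * fn / (fn + tp),
            "PPV": 100 * tp / (tp + fp),
            "NPV": 100 * tn / (tn + fn),
        }
    return pd.DataFrame(rows).T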

As we did in the previous notebook, we can find the threshold that would make predictive value the same for both groups.

from rcs_utils import predictive_value

matrix_all = make_matrix(cp)
ppv, npv = predictive_value(matrix_all)

from rcs_utils import crossing

crossing(table_male["PPV"], ppv)
array(3.36782883)

crossing(table_male["NPV"], npv)
array(3.40116329)

With a threshold near 3.4, male defendants would have the same predictive values as the general population. Now let’s do the same computation for female defendants.

crossing(table_female["PPV"], ppv)
array(6.88124668)

crossing(table_female["NPV"], npv)
array(6.82760429)

To get the same predictive values for men and women, we would need substantially different thresholds: about 6.8 compared to 3.4. At those levels, the false positive rates would be very different:

from rcs_utils import interpolate

interpolate(table_male["FPR"], 3.4)
array(39.12)

interpolate(table_female["FPR"], 6.8)
array(9.14)

And so would the false negative rates.

interpolate(table_male["FNR"], 3.4)
array(30.98)

interpolate(table_female["FNR"], 6.8)
array(74.18)

If the test is calibrated in terms of predictive value, it is uncalibrated in terms of error rates.

ROC

In the previous notebook I defined the receiver operating characteristic (ROC) curve. The following figure shows ROC curves for male and female defendants:

from rcs_utils import plot_roc

plot_roc(table_male)
plot_roc(table_female)

The ROC curves are nearly identical, which implies that it is possible to calibrate COMPAS equally for male and female defendants.

Summary

With respect to sex, COMPAS is fair by the criteria posed by the ProPublica article: it has the same error rates for groups with different prevalence. But it is unfair by the criteria of the WaPo article, which argues:

A risk score of seven for black defendants should mean the same thing as a score of seven for white defendants. Imagine if that were not so, and we systematically assigned whites higher risk scores than equally risky black defendants with the goal of mitigating ProPublica’s criticism. We would consider that a violation of the fundamental tenet of equal treatment.

With respect to male and female defendants, COMPAS violates this tenet.

So who’s right? We have two competing definitions of fairness, and it is mathematically impossible to satisfy them both. Is it better to have equal error rates for all groups, as COMPAS does for men and women? Or is it better to be calibrated, which implies equal predictive values? Or, since we can’t have both, should the test be “tempered”, allowing both error rates and predictive values to depend on prevalence?

In the next notebook I explore these trade-offs in more detail. And I summarized these results in Chapter 9 of Probably Overthinking It.



January 3, 2025

Confidence In the Press


This is the fifth in a series of excerpts from Elements of Data Science, now available from Lulu.com and online booksellers. It’s based on Chapter 16, which is part of the political alignment case study. You can read the complete example here, or run the Jupyter notebook on Colab.


Because this is a teaching example, it builds incrementally. If you just want to see the results, scroll to the end!


Chapter 16 is a template for exploring relationships between political alignment (liberal or conservative) and other beliefs and attitudes. In this example, we’ll use that template to look at the ways confidence in the press has changed over the last 50 years in the U.S.

The dataset we’ll use is an excerpt of data from the General Social Survey. It contains three resamplings of the original data. We’ll start with the first.

datafile = "gss_pacs_resampled.hdf"gss = pd.read_hdf(datafile, "gss0")gss.shape(72390, 207)

It contains one row for each respondent and one column per variable.

Changes in Confidence

The General Social Survey includes several questions about confidence in various institutions. Here are the names of the variables that contain the responses.

' '.join(column for column in gss.columns if 'con' in column)
'conarmy conbus conclerg coneduc confed confinan coninc conjudge conlabor conlegis conmedic conpress conrinc consci contv'

Here’s how this section of the survey is introduced.

I am going to name some institutions in this country. As far as the people running these institutions are concerned, would you say you have a great deal of confidence, only some confidence, or hardly any confidence at all in them?

The variable we’ll explore is conpress, which is about “the press”.

varname = "conpress"column = gss[varname]column.tail()72385 2.072386 3.072387 3.072388 2.072389 2.0Name: conpress, dtype: float64

As we’ll see, responses to this question have changed substantially over the last few decades.

Responses

Here’s the distribution of responses:

column.value_counts(dropna=False).sort_index()

1.0     6968
2.0    24403
3.0    16769
NaN    24250
Name: conpress, dtype: int64

The special value NaN indicates that the respondent was not asked the question, declined to answer, or said they didn’t know.

The following cell shows the numerical values and the text of the responses they stand for.

responses = [1, 2, 3]
labels = [
    "A great deal",
    "Only some",
    "Hardly any",
]

Here’s what the distribution looks like. plt.xticks puts labels on the x-axis.

pmf = Pmf.from_seq(column)
pmf.bar(alpha=0.7)
decorate(ylabel="PMF", title="Distribution of responses")
plt.xticks(responses, labels);

About half of the respondents have “only some” confidence in the press – but we should not make too much of this result because it combines different numbers of respondents interviewed at different times.

Responses over time

If we make a cross tabulation of year and the variable of interest, we get the distribution of responses over time.

xtab = pd.crosstab(gss["year"], column, normalize="index") * 100
xtab.head()

conpress        1.0        2.0        3.0
year
1973      22.696477  62.398374  14.905149
1974      24.846835  55.752212  19.400953
1975      23.928077  58.160443  17.911480
1976      29.323308  53.588517  17.088175
1977      24.484365  59.148370  16.367265

Now we can plot the results.

for response, label in zip(responses, labels):
    xtab[response].plot(label=label)

decorate(xlabel="Year", ylabel="Percent", title="Confidence in the press")

The percentages of “A great deal” and “Only some” have been declining since the 1970s. The percentage of “Hardly any” has increased substantially.

Political alignment

To explore the relationship between these responses and political alignment, we’ll recode political alignment into three groups:

d_polviews = {
    1: "Liberal",
    2: "Liberal",
    3: "Liberal",
    4: "Moderate",
    5: "Conservative",
    6: "Conservative",
    7: "Conservative",
}

Now we can use replace and store the result as a new column in the DataFrame.

gss["polviews3"] = gss["polviews"].replace(d_polviews)

With this scale, there are roughly the same number of people in each group.

pmf = Pmf.from_seq(gss["polviews3"])
pmf.bar(color="C1", alpha=0.7)
decorate(
    xlabel="Political alignment",
    ylabel="PMF",
    title="Distribution of political alignment",
)

Group by political alignment

Now we can use groupby to group the respondents by political alignment.

by_polviews = gss.groupby("polviews3")

Here’s a dictionary that maps from each group to a color.

muted = sns.color_palette("muted", 5)
color_map = {"Conservative": muted[3], "Moderate": muted[4], "Liberal": muted[0]}

Now we can make a PMF of responses for each group.

for name, group in by_polviews:
    plt.figure()
    pmf = Pmf.from_seq(group[varname])
    pmf.bar(label=name, color=color_map[name], alpha=0.7)
    decorate(ylabel="PMF", title="Distribution of responses")
    plt.xticks(responses, labels)

Looking at the “Hardly any” response, it looks like conservatives have the least confidence in the press.

Recode

To quantify changes in these responses over time, one option is to put them on a numerical scale and compute the mean. Another option is to compute the percentage who choose a particular response or set of responses. Since the changes have been most notable in the “Hardly any” response, that’s what we’ll track. We’ll use replace to recode the values so “Hardly any” is 1 and all other responses are 0.

d_recode = {1: 0, 2: 0, 3: 1}
gss["recoded"] = column.replace(d_recode)
gss["recoded"].name = varname

We can use value_counts to confirm that it worked.

gss["recoded"].value_counts(dropna=False)0.0 31371NaN 242501.0 16769Name: conpress, dtype: int64

Now if we compute the mean, we can interpret it as the fraction of respondents who report “hardly any” confidence in the press. Multiplying by 100 makes it a percentage.

gss["recoded"].mean() * 10034.833818030743664

Note that the Series method mean drops NaN values before computing the mean. The NumPy function mean does not.

Average by group

We can use by_polviews to compute the mean of the recoded variable in each group, and multiply by 100 to get a percentage.

means = by_polviews["recoded"].mean() * 100
means

polviews3
Conservative    44.410101
Liberal         27.293806
Moderate        34.113831
Name: conpress, dtype: float64

By default, the group names are in alphabetical order. To get the values in a particular order, we can use the group names as an index:

groups = ["Conservative", "Moderate", "Liberal"]means[groups]polviews3Conservative 44.410101Moderate 34.113831Liberal 27.293806Name: conpress, dtype: float64

Now we can make a bar plot with color-coded bars:

title = "Percent with hardly any confidence in the press"colors = color_map.values()means[groups].plot(kind="bar", color=colors, alpha=0.7, label="")decorate( xlabel="", ylabel="Percent", title=title,)plt.xticks(rotation=0);_images/9582dd9bf6d69f9295d546ba69db763928bf20cdf27d8dab7a08f64d33fb7f49.png

Conservatives have less confidence in the press than liberals, and moderates are somewhere in the middle.

But again, these results are an average over the interval of the survey, so you should not interpret them as a current condition.

Time series

We can use groupby to group responses by year.

by_year = gss.groupby("year")

From the result we can select the recoded variable and compute the percentage that responded “Hardly any”.

time_series = by_year["recoded"].mean() * 100

And we can plot the results with the data points themselves as circles and a local regression model as a line.

plot_series_lowess(time_series, "C1", label="")
decorate(xlabel="Year", ylabel="Percent", title=title)

The fraction of respondents with “Hardly any” confidence in the press has increased consistently over the duration of the survey.
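The helper plot_series_lowess comes from the utilities used with the book and is not defined in this excerpt. A minimal version might look like this (a sketch under my assumptions, using the LOWESS smoother from statsmodels).

import matplotlib.pyplot as plt
from statsmodels.nonparametric.smoothers_lowess import lowess

def plot_series_lowess(series, color, label=""):
    """Plot a time series as points plus a LOWESS-smoothed line."""
    plt.plot(series.index, series.values, "o", color=color, alpha=0.5)
    smoothed = lowess(series.values, series.index.values)
    plt.plot(smoothed[:, 0], smoothed[:, 1], "-", color=color, label=label)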

Time series by group

So far, we have grouped by polviews3 and computed the mean of the variable of interest in each group. Then we grouped by year and computed the mean for each year. Now we’ll use pivot_table to compute the mean in each group for each year.

table = gss.pivot_table(
    values="recoded", index="year", columns="polviews3", aggfunc="mean"
) * 100
table.head()

polviews3  Conservative    Liberal   Moderate
year
1974          22.482436  17.312073  16.604478
1975          22.335025  10.884354  17.481203
1976          19.495413  17.794486  14.901257
1977          22.398190  13.207547  14.650767
1978          27.176221  18.048780  16.819013

The result is a table that has years running down the rows and political alignment running across the columns. Each entry in the table is the mean of the variable of interest for a given group in a given year.

Plotting the results

Now let’s see the results.

for group in groups:
    series = table[group]
    plot_series_lowess(series, color_map[group])

decorate(
    xlabel="Year",
    ylabel="Percent",
    title="Percent with hardly any confidence in the press",
)

Confidence in the press has decreased in all three groups, but among liberals it might have leveled off or even reversed.

Resampling

The figures we’ve generated so far in this notebook are based on a single resampling of the GSS data. Some of the features we see in these figures might be due to random sampling rather than actual changes in the world. By generating the same figures with different resampled datasets, we can get a sense of how much variation there is due to random sampling. To make that easier, the following function contains the code from the previous analysis all in one place.

def plot_by_polviews(gss, varname):
    """Plot mean response by polviews and year.

    gss: DataFrame
    varname: string column name
    """
    gss["polviews3"] = gss["polviews"].replace(d_polviews)
    column = gss[varname]
    gss["recoded"] = column.replace(d_recode)

    table = gss.pivot_table(
        values="recoded", index="year", columns="polviews3", aggfunc="mean"
    ) * 100

    for group in groups:
        series = table[group]
        plot_series_lowess(series, color_map[group])

    decorate(
        xlabel="Year",
        ylabel="Percent",
        title=title,
    )

Now we can loop through the three resampled datasets and generate a figure for each one.

datafile = "gss_pacs_resampled.hdf"for key in ["gss0", "gss1", "gss2"]: df = pd.read_hdf(datafile, key) plt.figure() plot_by_polviews(df, varname)_images/b206bc96158affd5c576feea8e214bc24069a26028e343a303b0d987b6581c99.png_images/60b020b2033eee5acf7c98ef76f7fb2346cd4ec9f5c4aaa680f30cb77e362664.png_images/5e9b362b7385c42b2e13cb9bbf57b15aafb71cdd804e687f8eeaadb98f2b6f1f.png

If you see an effect that is consistent in all three figures, it is less likely to be due to random sampling. If it varies from one resampling to the next, you should probably not take it too seriously.

Based on these results, it seems likely that confidence in the press is continuing to decrease among conservatives and moderates, but not liberals – with the result that polarization on this issue has increased since the 1990s.


Probably Overthinking It

Allen B. Downey
Probably Overthinking It is a blog about data science, Bayesian Statistics, and occasional other topics.