Analyzing Baseball Data with R

“Our community has continued to grow exponentially, thanks to those who inspire the next generation. And inspiring the next generation is what the authors of Analyzing Baseball Data with R are doing. They are setting the career path for still thousands more. We all need some sort of kickstart to take that first or second step. You may be a beginner R coder, but you need access to baseball data. How do you access this data, how do you manipulate it, how do you analyze it? This is what this book does for you. But it does more, by doing what sabermetrics does it asks baseball questions. Throughout the book, baseball questions are asked, some straightforward, and others more thought-provoking.”

From the Foreword by Tom Tango

Analyzing Baseball Data with R, Third Edition, introduces R to sabermetricians, baseball enthusiasts, and students interested in exploring the richness of baseball data. It equips you with the skills and software tools needed to perform every step of an analysis: importing the data, transforming them into an appropriate format, visualizing them with graphs, and performing a statistical analysis.

The authors first present an overview of publicly available baseball datasets and a gentle introduction to the data structures and the exploratory and data management capabilities of R. They also cover the ggplot2 graphics functions and employ a tidyverse-friendly workflow throughout. Much of the book illustrates the use of R through popular sabermetrics topics, including the Pythagorean formula, run expectancy, catcher framing, career trajectories, simulation of games and seasons, patterns of streaky behavior of players, and launch angles and exit velocities. All the datasets and R code used in the text are available for download online.

New to the third edition is revised R code that makes use of new functions available through the tidyverse. The third edition also introduces three chapters of new material, focusing on communicating results via presentations using the Quarto publishing system, web applications using the Shiny package, and working with large data files. An online version of this book is hosted at

Kindle Edition

Published August 1, 2024

About the author

Jim Albert

26 books · 1 follower
Jim Albert is a Distinguished University Professor of Statistics at Bowling Green State University. His research interests include Bayesian modeling and applications of statistical thinking in sports. He has authored or coauthored several books including Ordinal Data Modeling, Bayesian Computation with R, and Workshop Statistics: Discovery with Data, A Bayesian Approach.

Ratings & Reviews

Community Reviews

5 stars: 8 (53%)
4 stars: 5 (33%)
3 stars: 1 (6%)
2 stars: 1 (6%)
1 star: 0 (0%)
Displaying 1 of 1 review
Aaron
205 reviews · 1 follower
June 29, 2025
This is, by far, the best textbook on sabermetrics (baseball statistics) available right now, in my opinion. The third edition, released in late 2024, covers a wide range of fascinating topics—catcher framing, Statcast data analysis, career trajectories, and more—accompanied by R code you can run on your own machine.

The writing is clear, the editing is solid, and the R code makes great use of the tidyverse, especially ggplot2. The book features a number of unique, visually striking plots that make the analysis even more engaging. I just love this book.