Data Science Ethics: Concepts, Techniques, and Cautionary Tales

Data science ethics is all about what is right and wrong when conducting data science. Data science has so far been primarily used for positive outcomes for businesses and society. However, just as with any technology, data science has also come with some negative consequences: an increase of privacy invasion, data-driven discrimination against sensitive groups, and decision making by complex models without explanations.

While data scientists and business managers are not inherently unethical, they are not trained to weigh the ethical considerations that come from their work. Data Science Ethics addresses this increasingly significant gap and highlights different concepts and techniques that aid understanding. These range from k-anonymity and differential privacy to homomorphic encryption and zero-knowledge proofs for addressing privacy concerns, and also include techniques to remove discrimination against sensitive groups and various explainable AI techniques.

Real-life cautionary tales further illustrate the importance and potential impact of data science ethics, including tales of racist bots, search censoring, government backdoors, and face recognition. The book is punctuated with structured exercises that provide hypothetical scenarios and ethical dilemmas for reflection that teach readers how to balance the ethical concerns and the utility of data.

272 pages, Paperback

Published June 24, 2022

1 person is currently reading
24 people want to read

About the author

David Martens

23 books, 1 follower

Community Reviews

5 stars: 8 (80%)
4 stars: 2 (20%)
3 stars: 0 (0%)
2 stars: 0 (0%)
1 star: 0 (0%)
Displaying 1 - 5 of 5 reviews
1 review
June 14, 2022
Very useful overview of the current landscape in data science ethics. I especially liked the easy-to-grasp examples in every chapter. This is a recommended read for everyone in the data science industry (and even for those outside)!
Minh Nguyen
103 reviews, 2 followers
May 16, 2022
Meticulously researched (~500 references) and well written, this book is an essential read for data scientists and their managers. It is very easy to read if you skip the unfamiliar technical stuff, so I would also recommend it to business people who work with data scientists or manage data-driven businesses.

Data privacy and ethics have become more important than ever before. A lack of ethics in the data science process can cause a business to fail or incur huge legal costs. Martens proposes a framework for data science ethics called the FAT flow (Fair, Accountable and Transparent). These three aspects of data science ethics are incorporated in each stage of the data science flow: data gathering, preprocessing, modeling, evaluation and deployment. Many techniques and concepts regarding data privacy are well documented, such as data encryption, hashing, k-anonymity, l-diversity, t-closeness, re-identification, differential privacy, zero-knowledge proofs, federated learning, explainable AI, p-hacking and many more. I have been using many of those techniques at work, but I sometimes found it hard to explain those concepts to my colleagues. Martens’s explanations and examples are really readable for analysts or business people who have basic knowledge of data science.

My personal take on the FAT framework is that I would like to add another letter to it: L for Lawfulness. It would then be the FLAT framework, consisting of Fairness, Lawfulness, Accountability and Transparency. With the GDPR, the CCPA and the equivalent APPI law in Japan being enforced, and with upcoming rules on AI, data science ethics is a hot topic for every business!
Vinayak
2 reviews
June 9, 2022
Data Science Ethics is an excellent primer on the ethical issues and challenges associated with deploying machine learning systems to solve real-world problems. Professor Martens has done a masterful job on his research and explained complex concepts in understandable and entertaining prose.

The examples of real-life failures (e.g. COMPAS, Cambridge Analytica, Apple Card) highlight the need for data scientists to be vigilant about potential misuses and unintended consequences of machine learning systems. Chapter 4 introduces the reader to the emerging technologies (e.g. differential privacy, zero-knowledge proofs, homomorphic encryption, secure multi-party computation, federated learning) that can be used to better protect data privacy and reduce bias.

As a long-time data scientist who has had to deal with some of these issues, I wish this book had been written years ago, but am grateful that it is available now.
Konstantina
1 review, 1 follower
January 3, 2025
This book offers an enlightening exploration of the ethical dimensions of data science by blending theoretical insights with practical examples. I particularly appreciated Aristotle's idea of equilibrium as a metaphor to emphasize balance in ethical decision-making. The examples in the book are both relatable and thought-provoking, making abstract principles easy to grasp and apply to real-world scenarios.
This book is a must-read for anyone curious about the intersection of ethics and data science. It left me more thoughtful about the responsibilities of data practitioners and the societal impact of their work.
278 reviews
November 21, 2023
Nice introduction, clear structure, good examples.
Skimmed most of the (little) mathematics, but overall very informative.