How to make simple sense of complex statistics--from the author of "Numbers Rule Your World"
We live in a world of Big Data--and it's getting bigger every day. Virtually every choice we make hinges on how someone generates data . . . and how someone else interprets it--whether we realize it or not.
Where do you send your child for the best education? Big Data. Which airline should you choose to ensure a timely arrival? Big Data. Who will you vote for in the next election? Big Data.
The problem is, the more data we have, the more difficult it is to interpret it. From world leaders to average citizens, everyone is prone to making critical decisions based on poor data interpretations.
In "Numbersense," expert statistician Kaiser Fung explains when you should accept the conclusions of the Big Data "experts"--and when you should say, "Wait . . . what?" He delves deeply into a wide range of topics, offering the answers to important questions, such as: How does the college ranking system really work? Can an obesity measure solve America's biggest healthcare crisis? Should you trust current unemployment data issued by the government? How do you improve your fantasy sports team? Should you worry about businesses that track your data?
Don't take for granted statements made in the media, by our leaders, or even by your best friend. We're on information overload today, and there's a lot of bad information out there.
"Numbersense" gives you the insight into how Big Data interpretation works--and how it too often doesn't work. You won't come away with the skills of a professional statistician. But you will have a keen understanding of the data traps even the best statisticians can fall into, and you'll trust the mental alarm that goes off in your head when something just doesn't seem to add up.
Praise for "Numbersense"
"'Numbersense' correctly puts the emphasis not on the size of big data, but on the analysis of it. Lots of fun stories, plenty of lessons learned--in short, a great way to acquire your own sense of numbers!" --Thomas H. Davenport, coauthor of "Competing on Analytics" and President's Distinguished Professor of IT and Management, Babson College
"Kaiser's accessible business book will blow your mind like no other. You'll be smarter, and you won't even realize it. Buy. It. Now." --Avinash Kaushik, Digital Marketing Evangelist, Google, and author, "Web Analytics 2.0"
"Each story in 'Numbersense' goes deep into what you have to think about before you trust the numbers. Kaiser Fung ably demonstrates that it takes skill and resourcefulness to make the numbers confess their meaning." --John Sall, Executive Vice President, SAS Institute
"Kaiser Fung breaks the bad news--a ton more data is no panacea--but then has got your back, revealing the pitfalls of analysis with stimulating stories from the front lines of business, politics, health care, government, and education. The remedy isn't an advanced degree, nor is it common sense. You need 'Numbersense.'" --Eric Siegel, founder, Predictive Analytics World, and author, "Predictive Analytics"
Q: Survey Survival Game, Secret Pacts, and Aided Recall (c)
Q: Each chapter is inspired by a recent news item in which someone made a claim and backed it up with data. I show how I validated these assertions, by asking incisive questions, by checking consistency, by quantitative reasoning, and sometimes, by procuring and analyzing relevant data. Does Groupon’s business model make sense? Will a new measure of obesity solve our biggest health crisis? Was Claremont McKenna College a small-time cheat in the school ranking game? Is government inflation and unemployment data trustworthy? How do we evaluate performance in fantasy sports leagues? Do we benefit when businesses personalize marketing tactics by tracking our activities? (c) I couldn't characterize this book any better.
Q: NUMBERSENSE is the one quality that I desire the most when hiring a data analyst; it separates the truly talented from the merely good. I typically look for three things, the other two being technical ability and business thinking. One can be a coding wizard but lacks any NUMBERSENSE. One can be a master storyteller who can connect the dots but lacks any NUMBERSENSE. NUMBERSENSE is the third dimension. NUMBERSENSE is that noise in your head when you see bad data or bad analysis. It’s the desire and persistence to get close to the truth. It’s the wisdom of knowing when to make a U-turn, when to press on, but mostly when to stop. It’s the awareness of where you came from, and where you’re going. It’s gathering clues, and recognizing decoys. (c)
Q: In analyzing data, there is no way to avoid having theoretical assumptions. Any analysis is part data, and part theory. Richer data lends support to many more theories, some of which may contradict each other, as we noted before. But richer data does not save bad theory, or rescue bad analysis. The world has never run out of theoreticians; in the era of Big Data, the bar of evidence is reset lower, making it tougher to tell right from wrong. (c)
Q: When more people are performing more analyses more quickly, there are more theories, more points of view, more complexity, more conflicts, and more confusion. There is less clarity, less consensus, and less confidence. (c)
Q: More data inevitably results in more time spent arguing, validating, reconciling, and replicating. All of these activities create doubt and confusion. There is a real danger that Big Data moves us backward, not forward. It threatens to take science back to the Dark Ages, as bad theories gain ground by gathering bad evidence and drowning out good theories. (c)
Q: One aspect of the Wolverine Scholars Program was curious, and immediately stirred much index-finger-wagging in the boisterous law-school blogosphere: The applicants do not have to submit scores from the Law School Admission Test (LSAT), a standard requirement of every applicant to Michigan and most other accredited law schools in the nation. Even more curiously, taking the LSAT is a cause for disqualification. (c)
Q: With this in mind, we play Admissions Dean for a day. Not any Admissions Dean but the most cynical, most craven, most calculating Dean of an elite law school. We use every trick in the book, we leave no stones unturned, and we take no prisoners. The U.S. News ranking is the elixir of life; nothing else matters to us. It’s a dog-eat-dog world: If we don’t, our rival will. We are going upstream, so that standing still is rolling backwards. (c) Shaking in my boots. Me. With laughter.
Q: The design of the surveys is puzzling. Why do they expect the administrators of one school or the partners of one law firm to have panoramic vision of all 200 law schools? The rate of response for the professional survey is low, below 15 percent, and the survey sample is biased as it is derived from the Top Law Firms ranked by none other than U.S. News. ... such grumbling is pointless, and has proven futile against the potent marketing machine of U.S. News. The law school ranking, indeed any kind of subjective ranking, does not need to be correct; it just has to be believed. (c)
Q: In our time, we have come to adopt all types of rating products with flimsy scientific bases; we don’t think twice while citing Nielsen television ratings, Michelin ratings for restaurants, Parker wine ratings, and lately, the Klout Score for online reputation. (c)
Q: A job is a job is a job. Not everyone can be an associate in Big Law. We tally up all jobs, part-time as well as fulltime, temporary as well as permanent, at big shops as well as at mom-and-pop firms, those requiring Bar passage as well as those that don’t. Blending frappuccinos at Starbucks, selling T-shirts at American Apparel, delivering standup comedy at the local bar: These are all legitimate jobs. We call up our friends in high places, courthouses for instance, and arrange for short-term apprenticeships, funded by the law school, of course. In case that’s not enough, we hire from within. Our research labs, our libraries, and our dining halls can take extra help. Surely, creating jobs for downtrodden students saddled with unsustainable debt is the morally right thing to do. Let’s offer temporary positions to one batch of students at graduation, before they fill out the first survey. After six months, we shift the jobs to a second group, in ample time for the second survey. (c)
This book started out well, but by the end I was glad to be done.
Annoyances? Sure.
For starters, the book's title is wrong. This is not about how to use big data to your advantage, and while there are a few good pieces of advice for dealing with data in general, there's very little "big" here. The companies mentioned use big data; the author doesn't, or at least not in this book.
Here's another: Some of the chapters drag on way too long (especially for such a slim book), and go into minutiae about the players his friends picked in fantasy football.
One more: The author repeatedly throws in the title as a catchphrase, in smallcaps (NUMBERSENSE), in a relentless bid to coin this in a kind of "Freakonomics" way, and he plugs his previous book half a dozen times.
This last annoyance I can blame on the editor (or publisher, McGraw Hill), and the title may be their fault too, though these things still pull the rating down a full star for me.
The good? The first chapter gives a stunning and detailed exposition on how law schools tweak their data to do better in the rankings. There are a couple other
Other good points: a description of the grunt work that is massaging data, and a reminder that all big data sets are tweaked and cleaned, and that assumptions have been made, before analysis can begin. There are nice illustrations of some basic concepts in stats, like Simpson's paradox and the importance of the counterfactual.
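For readers who have not met Simpson's paradox, a few lines of arithmetic make it concrete. The sketch below uses illustrative counts that are not taken from the book; the group names, option labels, and helper functions are all hypothetical. Option A has the better success rate inside every subgroup, yet option B looks better once the subgroups are pooled, because the two options were tried on very different mixes of easy and hard cases.

```python
# Illustrative (successes, trials) counts; NOT data from the book.
groups = {
    "easy cases": {"A": (81, 87), "B": (234, 270)},
    "hard cases": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, trials):
    """Success rate as a fraction between 0 and 1."""
    return successes / trials


def pooled(option):
    """Success rate for one option after lumping all groups together."""
    successes = sum(groups[g][option][0] for g in groups)
    trials = sum(groups[g][option][1] for g in groups)
    return rate(successes, trials)


# A beats B within each group taken separately...
a_wins_every_group = all(
    rate(*groups[g]["A"]) > rate(*groups[g]["B"]) for g in groups
)
# ...but B beats A in the aggregate.
b_wins_overall = pooled("B") > pooled("A")

for g in groups:
    print(g, round(rate(*groups[g]["A"]), 3), round(rate(*groups[g]["B"]), 3))
print("pooled", round(pooled("A"), 3), round(pooled("B"), 3))
```

The reversal comes entirely from the unequal group sizes, which is why a single aggregate number can mislead when the breakdown is hidden.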
I really loved this book, which gives you an introduction to Numbersense - essentially when anyone gives you stats (whether it's the unemployment rate, national inflation rate, or some margin of success or profitability) you ask the right questions behind the data. When raw data is manipulated or big data is being used to one's advantage (like a college admissions department) you'll be able to draw conclusions yourself and get more of the truth - not just what is being said.
Many people seem disappointed this book didn't teach them practically how to "use big data to their advantage", but each scenario was an example of this practice. The book's goal is to give you Numbersense - something more valuable than manipulating data. Spotting numbers that feel wrong and asking the right questions to see where they're coming from is an important skill in a world that runs on numbers.
“The key is not how much data is analyzed, but how.”
Data is manipulable. The same set of data can be analyzed to give exactly opposite results. With the accessibility of the Internet, we are living in a world of lots of data. “Big Data” is the term the author uses. It’s a vast amount of data, beyond what any normal data analysis program can handle or manage. Lots of data are obtainable, with lots of analyses of these data available, since every single one of the market players is studying these data to gain an edge in competition.
The author used the Gates Foundation’s example to let us know that even big organizations with lots of money and analysts can still make a stupid decision with the wrong data or analysis. Ten years ago, the Foundation made a mistake in assuming that smaller schools are better for student achievement, which was later proven untrue. He argued that Big Data moves us backwards, since more data results in more time spent analyzing, arguing, validating and replicating results. More of any of the above activities will cause more doubt and confusion. Therefore, it’s urgent to learn a way to analyze data so you can keep your head clear and not be lied to.
“Any kind of subjective ranking does not need to be correct, it just has to be believed”
What do we believe, and what technique do we use to help us make the decision? Data analysis is an art, and not every statistician knows what he’s talking about. A person with good “numbersense” will be way above the others in avoiding the pitfalls. A person with good numbersense will spot bad data or bad analyses, or know when to stop when collecting his own. Unfortunately, numbersense can’t be taught in a regular classroom, a program or a textbook. It’s only learned from another person or from real-life practice. After more than 20 years in management in a hospital, I know these people do exist, but rarely. They are wonderful problem solvers. Lucky for the rest of us, this book is a great place to start learning about numbersense. The author has a way of explaining complex subjects in a simple and understandable way, and his flow of thoughts is logical and very easy to follow. While analyzing data, the author also explains statistical terms thoroughly, noting for example that the term “significant” does not necessarily mean “important.”
The author uses real-life news examples where someone made a claim about something and then backed it up with data, and he analyzes them, explaining the process to us along the way. The examples include: law school admissions data, Groupon’s business model, diets and BMI, unemployment and jobs, our inability to remember prices and CPI, and even fantasy football. These examples were very interesting to read, as the author gives step-by-step explanations of how the data we see every day could easily be manipulated to fool us. My daughter is in the process of applying for college, and I can assure you, after reading the first chapter, I will never look at college rankings the same way.
I think every person in marketing, business, sociology, management or data analysis should read this book, as well as any consumer who wants to make sense of this so-called “Big Data.” Numbersense is a great word for people who have the talent of analyzing data and spotting errors or intended manipulation. This book reads very much like Freakonomics: A Rogue Economist Explores the Hidden Side of Everything by Steven Levitt, but is a bit more technical and might take a little understanding of statistics and/or business to fully appreciate. My background is business administration and healthcare, and I had a fun ride.
*Thanks to Netgalley and McGraw Hill for providing the advance reading copy.
I was really drawn to this non-fiction selection by Fung because I work in the IT industry and my specialty is analytics. Numbersense promises to be the book that reconciles Big Data and business decisions, guiding readers into harnessing data to answer important questions. While I found the book to be well written and technically accurate, it left me with a bit of confusion. Fung spends a good portion of the book illustrating how data scientists in the pocket of marketers can manipulate the story told by the underlying raw data through careful selection and application of specific statistical procedures and quantifications that don’t lie per se but simply give impressions that other procedures and quantifications might contradict. Lesson learned: Don’t just trust the data scientists; always, always have a look at the raw data and review the statistical methods used so that you can get the whole picture instead of just what the data scientists presenting the data want you to see. Ok, good.
But Fung goes on to detail in later chapters examples in which raw data, on the whole, turns out to be entirely misleading because there is too much “noise” in the quantifications to get a realistic understanding of the relationships between the variables. In these examples, he shows readers clearly that actionable information can only be extracted from the raw data by carefully selecting and applying the best statistical procedures and quantifications for the given questions we are trying to ask of our data. Lesson learned: sometimes looking at the raw data is not helpful at all and you need to rely on skilled data scientists to select the right procedures and quantifications to make sense of the data.
And now we are left with two lessons that are in potential conflict. Which of course begs the question, how are members of the intended audience – business folks without a deep statistical background – supposed to know whether the raw data is:
A. going to be useful in helping us determine whether our data scientists are clever little devils gaming us
or
B. too scary and noisy to tell us anything unfiltered and we need to trust our data scientists to intelligently apply the “right” procedures and quantifications to make sense of it all for us
If Fung, who is, by all impressions, a brilliant thinker and writer, can address this question in a revision of the original text, I’d be much more comfortable reclassifying Numbersense as a handy go-to guide on making sense of Big Data instead of a light and interesting read on some fascinating ways people in marketing, sports, and academia have manipulated data to their advantage.
A big big topic nowadays is Big Data (I almost watched an episode on this in Genzai Closeup - it's this Japanese show). So when I saw that this book is a way to make sense of big data, the nerd in me jumped out and clicked the "request" button. Good thing I got it, my nerd doesn't come out often enough!
Numbersense uses a number of different scenarios, from law school admissions to fantasy football, and shows us how data can be manipulated and how to see past the manipulations.
I'm not a numbers kind of girl, but I found this book to be easy to understand. The tone is friendly (without being condescending), and somehow, Kaiser Fung has a knack for explaining things simply.
Plus, after reading the first chapter, I'm kind of worried about the whole admissions to university thing. It's actually quite interesting to think about this chapter while bearing in mind the stuff written about in College (Un)bound (link leads to review). College (Un)bound, just to refresh your memory, is this book that I reviewed recently which looked at the necessity of college. One chunk of the book looks at which colleges are suitable for a particular student, and when tied in to admissions, well, it's really interesting.
For some reason, I felt like the chapter on admissions was the longest. After I finished reading that chapter, it seemed like everything else flew by. I'm not sure if it's some kind of skewed perception of time, but I do wonder if anyone else felt this way.
By the way, I think it's possible to read this book in any order, provided you read it a chapter at a time. Since they deal with unrelated topics, it's possible to read it in order of interest and still understand everything.
All in all, an interesting look at how people use data to lie to us and how we can see through the lies.
Disclaimer: I got a free copy of this book via NetGalley in exchange for a free and honest review.
I don't know what I was expecting when I picked up the book. I knew it was a light read, but honestly I expected him to reveal a few of his scientific techniques for dealing with numbers. Definitely not an exposé of the various times researchers quoted the numbers that would validate their theories. But still I liked it, because the author has done a lot of research to pull out counter-research and counter-counter-research. Also, I'm not sure why, but perhaps to attract people to pick up the book, the author has overused and hence abused the most hyped words, "Big Data", without really going into what it really is. My take on it: if you wanna learn about data science or big data, pick up a heavier book. This just won't cut it. But if you're looking for conspiracy theories backed by data that you can tell your friends when sitting around a campfire, this book has ample meat in it. Bonus points for the book for his writing style!!!
I thought this book would be more interesting but it wasn't really about big data. There's some decent analysis and even though I've never played fantasy (fairy tale) football the ending chapter on that was pretty good about demonstrating what you can do with data analytics. But most of the chapters droned on. The Epilogue story about problems in data manipulation when moving data from one database to another was funny since I've had to do text file manipulation in the past but it only showed that the author doesn't know how to get down and dirty quickly to get things done. And the second story ended up being about wasting time manually categorizing search terms and ended abruptly. Sometimes authors peak with their first book. I hope Fung's second is much better and there's a lot of room for that to happen.
The title was misleading, as most of the content was about basic statistics and how data and reports can be misinterpreted or sliced and diced to make a company or economy look better, rather than about big data itself. That said, the author is right in stating that people should not take data analysis at face value but should dig deeper, question how the data was collected, analyzed and interpreted, and ask what would have happened if some action had not been taken.
This book reoriented me to the significance of data interpretation. Richer data sets do not save bad theory or rescue bad analysis. Managing data (in a world of big data / information overload) is one of the many takeaways from the book. This means not falling into data traps and having that awareness when something just doesn't seem right in the data.
A casual way of telling people what Big Data is about. It is more about how the underlying assumptions and the interpretation of the data make it meaningful, which doesn't necessarily mean we need more data. Therefore it is about Numbersense!
This was an okay book. I have been using big data for a while, and this book only meddled around with some ideas. I wasn't too impressed with it to be honest.
There are so many ways the numbers may be skewed. With the right data transformation, exclusions or imputations, the numbers can be manipulated to tell the story the researcher wants the data to tell. Always check raw data, the assumptions and methods used to transform or normalize the data and the statistical techniques selected to analyze the data.
The book gives many examples of data manipulated for some advantage: law school deans fudge the numbers to get higher law school rankings. Groupon shows the benefit of advertising with them, but Groupon looks at the numbers as a whole and does not separate current customers who take advantage of the discount from those who are net new customers.
The epilogue shows two data challenges I am very familiar with. Too bad he does not have any quick fix for these: How do you get one system to accept the dates from another system as a date variable, not text or numeric? How do we categorize thousands of keywords into useful groups in a reasonable amount of time, especially considering these are always changing?
Fung reminds us that big data has nothing to say about causation; many things are correlated without one causing the other. He also demonstrates how statistical significance does not prove the results are important: tiny numbers with little real impact can be statistically significant.
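The Groupon point is really a counterfactual calculation, and a small sketch shows why the breakdown matters. Everything here is hypothetical: the function names, the prices, and the customer counts are made up for illustration, and the sketch assumes that existing customers would have paid full price anyway while new customers would not have come at all.

```python
def naive_profit(coupons_sold, merchant_revenue_per_coupon, unit_cost):
    """Aggregate view: every coupon sold counts as profitable new business."""
    return coupons_sold * (merchant_revenue_per_coupon - unit_cost)


def incremental_profit(new_customers, existing_customers, full_price,
                       merchant_revenue_per_coupon, unit_cost):
    """Counterfactual view of the same deal.

    Assumes (hypothetically) that new customers would not have bought
    without the deal, and that existing customers would have paid
    full price for the same item.
    """
    gain_from_new = new_customers * (merchant_revenue_per_coupon - unit_cost)
    lost_from_existing = existing_customers * (
        full_price - merchant_revenue_per_coupon
    )
    return gain_from_new - lost_from_existing


# Made-up deal: 1,000 coupons, of which 600 went to existing customers.
naive = naive_profit(1000, 50, 30)
incremental = incremental_profit(400, 600, 100, 50, 30)
print(naive, incremental)
```

With these made-up inputs the aggregate view reports a gain while the counterfactual view reports a loss, so the share of coupon buyers who are genuinely new decides whether the deal pays.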
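The gap between "statistically significant" and "important" can be sketched with a textbook two-sample z-test. The numbers below are invented for illustration and do not come from the book; the function name is hypothetical, and the sketch assumes two equal-sized groups with a known, shared standard deviation.

```python
import math


def two_sample_z(mean_a, mean_b, sd, n_per_group):
    """Two-sided z-test for a difference in means.

    Assumes equal group sizes and a known common standard deviation.
    Returns (z statistic, two-sided p-value, standardized effect size).
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = (mean_a - mean_b) / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    effect_size = (mean_a - mean_b) / sd  # difference measured in SD units
    return z, p_value, effect_size


# The same tiny difference (0.1 on a scale whose SD is 10) at two sample sizes.
z_big, p_big, d_big = two_sample_z(100.1, 100.0, 10.0, 1_000_000)
z_small, p_small, d_small = two_sample_z(100.1, 100.0, 10.0, 100)
print(p_big, p_small, d_big)
```

With a million observations per group the p-value is minuscule, yet the difference is still only about one hundredth of a standard deviation. At 100 per group the same difference is nowhere near significant. The p-value tracks sample size as much as it tracks the size of the effect.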
Overall, I think the book was a good read. It had great examples for social data, marketing data, economics data and fantasy football.
Nothing really mind blowing in this book. It's an easy read though with a couple of good nuggets here and there. If I could give it two and a half stars I would because it was average at best. I also don't feel like the book accomplished its goal. If you truly want to become numerate, there are better books out there such as Lies, Damned Lies, and Statistics. The part about using "big data to your advantage" is a bit deceiving as well. Personally, I don't believe some of the topics he discusses fall under the umbrella of big data, such as fantasy football. You're not going to use (or need) terabytes of data to perform fantasy football analyses.
If you have nothing else to read though and just want to power through a book then this isn't a bad option.
Numbersense is a book about data, analysis, and the dangers of reading too much into any data. Data and stats are interesting. They reveal things that are unknown to us. The author, through discussions of rankings of universities based on GPAs, the use of BMI as an indicator of obesity and subsequently cardiovascular diseases, the story of Groupon, and the BLS data on unemployment, each in its own chapter, delves into the need to be careful in analyzing data, consider counterfactuals, be aware of biases, etc. The Target story about predicting pregnancy even before the Target shopper knows herself actually depends on a predictive model built on several assumptions. A good read when you have more than an hour's commute every day!
In great detail author Kaiser Fung gives us a picture of how numbers and statistics are skewed to meet the objectives of marketing, public relations, corporate America, government and anyone else who is trying to make a point. You'll learn about Groupon, the price of eggs, jobless figures, law schools, Target, dieting, and more. Numbersense is specific, detailed, and thorough. When you are finished reading it, hopefully you will ask the "counterfactual" questions whenever you feel that you are encountering another attempt to fool you into believing what somebody wants you to believe rather than the truth.
Excellent book. This was better than his first book. He had some good examples of how to look and use data, but the best message I got from the book is something contrary to what I previously thought. Bad data can lead to bad analysis and that can lead to bad policy or decisions. That is a good message to keep in mind as we continue to get more data and more tools to produce more analytics.
Nothing fantastically new or wonderful here, but a good, light read for my lunches. The idea of 'Numbersense' is one that I have proclaimed for years and so I would recommend reading this to anyone without a solid background in statistics as a way to understand some of the garbage being produced in the world.