Visualizing Data is about visualization tools that provide deep insight into the structure of data. There are graphical tools such as coplots, multiway dot plots, and the equal count algorithm. There are fitting tools such as loess and bisquare that fit equations, nonparametric curves, and nonparametric surfaces to data. But the book is much more than just a compendium of useful tools. It conveys a strategy for data analysis that stresses the use of visualization to thoroughly study the structure of data and to check the validity of statistical models fitted to data. The result of the tools and the strategy is a vast increase in what you can learn from your data. The book demonstrates this by reanalyzing many data sets from the scientific literature, revealing missed effects and inappropriate models fitted to data.
TODO full review:
+ Nearly a decade after his classic book The Elements of Graphing Data, William S. Cleveland returns with Visualizing Data. This 1993 book is still well worth the time for the starting practitioner. It complements Stephen Few, Dona M. Wong, and (the best I found among the group of authors focusing on the basics) Nathan Yau with the technical (read: mathematical) details, but does not (and cannot) cover the modern software used to create the plots.
+/- Covers various aspects of drawing, including touching ("brushing") an image to add labels for key points, marking ("slicing") specific areas of the plot, zooming in, and changing the aspect ratio ("banking"). The terms proposed by Cleveland have not passed the test of time, and the methods proposed here are still tentative.
++ Plenty of good material on Q-Q plots, box plots, distribution fits and residuals, curve fitting (all sorts of parametric fitting, plus LO[W]ESS), scatterplots, and higher-variate analysis (tri- and multi-, with coplots, level plots, contour plots, scatterplot matrices, and even the dreaded 3d[-to-2d] plots).
+/- Quite a bit of material from John W. Tukey's Exploratory Data Analysis, but summarized well and explained for the beginner.
A tough yet necessary read for the non-statistician... At a time when anyone can produce colorful graphs in a few clicks, this book tells us how much thought, work, and hard-earned technique must go into plotting data, to reveal rather than distort the trends it conceals.
I bought this book years before I got around to reading it, and I had expected a very different book from the one I got. If I had known this wasn't a theoretical book, I probably would have read it much sooner.
Visualizing Data is very much like Exploratory Data Analysis. Cleveland simply shows how he would analyze various datasets, starting with single-variable datasets and continuing to what he calls hypervariate data (defined as "more than three variables"). Cleveland does a wonderful job presenting his visual techniques, and explains things in great detail. Unfortunately, he doesn't explain much of the math or vocabulary he uses. Ultimately, you can learn visual analysis from this book, but you may need another book, such as Exploratory Data Analysis, in order to follow along.
This is really great. It's not mathematically taxing, and it's certainly not a "definition, theorem, proof" book, but it doesn't intend to be that. It shows how one can visualize probability distributions, including joint probability distributions, and extract information from them graphically. Both numeric and graphical statistical inference are important, but this is the first book I've read with a graphical aspect.