“One of the most exciting developments from the world of ideas in decades, presented with panache by two frighteningly brilliant, endearingly unpretentious, and endlessly creative young scientists.” – Steven Pinker, author of The Better Angels of Our Nature
Our society has gone from writing snippets of information by hand to generating a vast flood of 1s and 0s that record almost every aspect of our lives: who we know, what we do, where we go, what we buy, and who we love. This year, the world will generate 5 zettabytes of data. (That’s a five with twenty-one zeros after it.) Big data is revolutionizing the sciences, transforming the humanities, and renegotiating the boundary between industry and the ivory tower.
What is emerging is a new way of understanding our world, our past, and possibly, our future. In Uncharted, Erez Aiden and Jean-Baptiste Michel tell the story of how they tapped into this sea of information to create a new kind of telescope: a tool that, instead of uncovering the motions of distant stars, charts trends in human history across the centuries. By teaming up with Google, they were able to analyze the text of millions of books. The result was a new field of research and a scientific tool, the Google Ngram Viewer, so groundbreaking that its public release made the front page of The New York Times, The Wall Street Journal, and The Boston Globe, and so addictive that Mother Jones called it “the greatest timewaster in the history of the internet.”
Using this scope, Aiden and Michel—and millions of users worldwide—are beginning to see answers to a dizzying array of once intractable questions. How quickly does technology spread? Do we talk less about God today? When did people start “having sex” instead of “making love”? At what age do the most famous people become famous? How fast does grammar change? Which writers had their works most effectively censored by the Nazis? When did the spelling “donut” start replacing the venerable “doughnut”? Can we predict the future of human history? Who is better known—Bill Clinton or the rutabaga?
All over the world, new scopes are popping up, using big data to quantify the human experience at the grandest scales possible. Yet dangers lurk in this ocean of 1s and 0s—threats to privacy and the specter of ubiquitous government surveillance. Aiden and Michel take readers on a voyage through these uncharted waters.
I really should only buy books at the airport. I picked up Uncharted because it had a subtitle with "big data" in it. As my office has started looking at how to mine big data and how to visualize it, I thought a "how to" book would help me get moving on developing a plan. Well, the book failed miserably at that, but then that was not its intent. This is one of the most educational books I have read in years. If I learned nothing else, I learned that if you have an idea you have to nurture it. Others will say "we don't do that," "you cannot do that," or "that does not exist." They did not have the idea, and hence they are not motivated to make it happen. I also learned that when dealing with big data you follow the data to learn; you might discover early on that what you wanted to learn is really not what you needed to learn, and that understanding the risk of false positives and other biases is more than important, it is absolutely essential. The book reinforces my sense that technology is speeding everything up and making everything new again. People post their lives on Twitter, Facebook, and Instagram and then say they want privacy. It is only going to get worse, and I honestly do not think we are going to put it back in the bottle and say "oops, did not mean for that to happen." We have had digital lives for about 25 years now; we have not even seen the "terrible twos" yet. This was a tough book for me only because I kept expecting it to deliver a formula, which it never did. Instead it delivered thoughts, ideas, and perspective, and for these reasons this is clearly a book worth reading.
The book was published in 2013, relatively the early days of what has come to be a fairly common buzzword. Therefore, it is probably unfair to expect this book to have the understanding or perspectives that the field has accumulated in the last few years. Having said that, I still think my expectations for the book were higher. They stemmed mostly from the title, and I thought there was tremendous scope there. We now consume, produce, and share tons of data on a daily basis. What could it say about us at a societal level? Wouldn't that be a great way to study how our culture has evolved as a species, and perhaps differently in various parts of the world? How do ideas spread, how many of them are universal, and do some have more velocity than others? But hold on. While this book does try to give some answers, it is based solely on the authors' experiments with datasets using the Google Books Ngram Viewer. This is a formidable tool: 30 million books digitized by Google. But it is limited too. These are only published books, and a subset of them at that. Books are only a small representation of culture, and because publishing was gated (in the past), they carry inherent biases. To be fair, the authors are aware of this and bring it up towards the end. It also raises the concerns that have now grown louder: who owns the data, who has access, and what is it being used for? So, if you go by the title, you might be a little disappointed, but it is an interesting story, well told and made accessible. It does provide many, many interesting trends and findings across disparate things like technology, popularity, and grammar. You would like it especially if you're interested in language: words, their usage, grammar, etc.
This was a fun and entertaining read. It starts with a unique set of data becoming available for the first time in human history. As Google started to scan millions of books into its digital library, an opportunity arose to explore new perspectives on the dynamics of cultural evolution over the last couple of centuries. It spawned a new branch of data science: culturomics.
The first part of the book narrates how the authors, two young scientists with a multidisciplinary background, convinced Google to provide access to the data and how they prepared the dataset so as to avoid legal complications and maximise analytic accuracy. Then they discuss how they tackled some tantalising research questions: How does a language grow and develop? How do people become famous? How does censorship work? How does collective memory work? How do technological inventions enter our cultural fabric?
This is quite instructive on different levels. One learns about the framing and operationalisation of tantalising data-science challenges. The insights into the dynamics of cultural evolution are also quite fascinating and not at all intuitive. Finally, I was surprised to learn about the startling contributions of many people I had never heard of: George Kingsley Zipf, Kristian Andvord, Hermann Ebbinghaus, Charlotte Salomon, ...
The book is told in a very sober, sympathetic voice. Very easy to follow but not at all condescending. Recommended for people curious about what data science and big data can mean for the way we humans understand ourselves.
The book's theme is the Google Books Ngram Viewer (https://books.google.com/ngrams). It analyses a huge number of books (tens of millions). A human being can read only a handful of books, but the Viewer reads millions of them mathematically and can reveal the relationships among all of those books. It is a books revolution!
Note: I received an advanced reading copy from the publisher in exchange for an honest review.
I find Google’s Ngram Viewer—a graphing tool that charts the frequency of words/phrases as they occur over time in the books currently digitized by Google—to be addictively fun and fascinating, so I was thrilled to find out that the creators wrote a book about it.
“Uncharted” starts out with an overview of “the natural selection”/”survival of the fittest” of irregular verbs, which leads into the story behind the Ngram Viewer’s development. The rest of the book delves into how studying the rise and fall of word/phrase usage through the Ngram Viewer can reveal things about our history and culture, including insights into fame, censorship, collective memory, and language evolution and growth. Ngram charts are integrated throughout to illustrate specific words/phrases being discussed, and there’s also an appendix of some amusing “Great Battles of History” Ngram charts. The concluding chapter explores possible future impacts and uses of big data.
Overall, an excellent, quick read. I appreciated that the authors present their findings as implications instead of Irrefutable Facts. Not only are they frank about the weaknesses and limitations of the Ngram Viewer, such as how statistical bias, false positives, false negatives, and other data curveballs can skew the charted results, but they also explain how they tried to avoid or diminish deceptive outcomes. The authors’ writing is clear and easy to understand, and has some humor sprinkled in to keep the detailed academic talk from getting too dry for a general audience. I think academic-minded readers would enjoy it too though—it’s a thought-provoking look at history, culture and language through the big data “telescope lens” of the Ngram Viewer. If nothing else, the book will inspire readers to experiment with the Ngram Viewer themselves and encourage further reading on big data.
Looking at some other reviews, I see complaints that this book focuses too much on the Ngram Viewer and not enough on big data in general. I can see why others thought the title was misleading (it’s metaphorical whereas many people interested in this stuff might be used to explicit journal article titles), and I can see how someone might be misled by the summary on my advanced reading copy, which mentions the Ngram Viewer as well as the implications of big data in general. It looks like the summary on Amazon (and presumably the finished copy) is a bit clearer about the Ngram Viewer being the focus, so hopefully future readers won’t have a similar problem.
I saw "Big Data" in its title, and I just had to grab it off the library shelf. Although it is a light read with slight over 200 pages, certain parts of the book felt pretty boring to me. But maybe it is because I am not really into literature. What I loved about the book was what it drew out from the entire process, from ensuring the issues of copyrights, practicalities of releasing the data, to dealing with the messiness of the data, problems with confounding factors, and how these issues were addressed.
Towards the end, there were noticeably highlights of some findings from this extremely large dataset of millions and millions of digitalise books. I would classify this book as one where I needed to dig through some soil to get to the gold. Re-emphasize is that it was at least a light read.
Big Data isn't just analyzing what's happening right now. It's used to analyze how we have changed over time. The authors use Google's ngram project to get insight into trends in culture through books and words in those books over the last 300 years. One amazing chapter described the measurable impact on culture due to Nazi oppression in the 1940s. Other chapters show how quickly people gain...and then lose fame. Words, too, have a measurable life and death.
Big Data isn't just for geeks. It's for anyone fascinated by history, culture, and how we change as a society.
The problem with this book is that it gets boring, quickly. It discusses the creation of Google's N-gram viewer and how it has been used to study history: which would be great, if the insights being generated were unique. But they aren't: they are primarily reflections of what we already know. I wish the book felt more in depth and thoughtful.
Uncharted can be thought of as a case study for a piece of software that demonstrates two emerging intellectual trends: big data and digital humanities. These are explored in the book through the creation of the Ngram Viewer interface for examining the scanned Google Books collection. Digital humanities is an interdisciplinary trend that brings computerized tracking and digital curating tools to fields such as History, Literature, Philosophy, Geography, and Language studies. When the data being examined is itself language, digital humanities overlaps quite nicely with a methodology that has been in place for the past five decades: corpus linguistics. But while corpus linguistics relies on different pieces of specialized concordancing software to gather, count, and track word combinations, Google Ngram Viewer, launched in December 2010, is a very accessible way to bring some of these tools to the fingertips of the general public. In this book, Ngram Viewer is deployed as a way to answer quick questions about cultural history.
The larger field of DH is introduced in Chapter 7 (Utopia, Dystopia, and Dat(a)topia), which looks at the range of historical records that could be digitized, and also some of the pitfalls of ever-wider access to such records. They note, for example, the spotty coverage of newspaper digitization e.g. “Most of Poe’s newspaper articles have not been digitized, and no one knows when they will be” (p. 172), and the even spottier digitization of the many unpublished formats of writing: manuscripts, letters, wills, etc. It’s worth noting that the problem is not only one of getting data into a digital form. Even some of the born-digital materials that humans now create will have a limited appearance in the historical record, since blog posts, email, web page ads, and caches of digitized recordings and transcripts are only as accessible as the servers that host them.
In focusing on occurrences found in Google Books, the book provides an entry into diachronic changes in word use. The results they show are exciting, but a cautionary note should be sounded. That is, it’s not as simple as looking at an ngram chart to have the story. What words are used is now clearly knowable, but capturing why they are used and identifying the right contexts in which to interpret them are still the necessary next steps of scholarship. Yet the authors sometimes present these as finished tasks. On seeing the first graph of ngram data for the word “evolution”, they note: “drawing from an ocean of data, the curve had distilled a simple powerful story that anyone could understand” (p. 159).
They do, however, acknowledge that as a data source, book publishing is too slow to trace certain faster-moving ideas and information (p. 148); many ideas are more typically discussed in media other than books, e.g. texting, email, TV news, face-to-face conversation. But this is often overlooked in the book, such as in the claim on p. 157 that it's now possible "to quantify the spirit of the people, the Volksgeist, by empirically measuring aspects of collective consciousness and collective memory." This enthusiasm leads the authors to coin the name "culturomics" for their approach: where "the omics denotes big data" and the cultur- evokes the anthropological studies of Franz Boas in being "empirically knowable" (pp. 158-9).
Such big-picture excitement is indicative of their repeated, but unexamined, premise that the number of written occurrences of a word can be equated with the frequency of the thoughts or experiences it represents: "By seeing how often people talk [in print] about a year, we can get a sense of how present the events of that year are in their minds" (p. 144); "Ngrams tell us about the past. Alas, they do not predict the future. Yet." (p. 157). However, on p. 189, they return to the topic of predictions, with the claim that "Ngrams that are going up [in a 20-year period] tend to keep going up. Ngrams that are going down tend to keep going down," leading the authors to hint at the possibility of "a predictive science of history."
Some reader-friendly history of science is presented at several points throughout the book, including an amusing discussion of Ebbinghaus's original experiments on long- and short-term memory, which make up some of the groundwork of the field of psychology (pp. 138-141), and a useful introduction to Zipf's law, explaining normal and non-normal distributions (pp. 28-33).
Several of the cultural incidents chosen as illustrations, however, verge on the melodramatic: the "immediate and devastating" impact on the lives and careers of the Hollywood Ten (p. 124); "this heartbreaking chart" showing mentions of Tiananmen Square (p. 127); the despondent painter Charlotte Salomon, who died in Auschwitz (p. 131); the 9/11 destruction of the World Trade Center (p. 142); the digital hounding that ended in the 2013 suicide of Rehtaeh Parsons (p. 181). At the same time, it's through the discussion of stories of such wide-ranging historical breadth that the authors first mention a very intriguing way to use the diachronic tracking of Ngram Viewer to automate finding gaps in the historical record that could indicate suppressed information.
The final chapter presents a much-needed call for the funding of humanities data collection to equal the level at which science projects are funded, suggesting that we need to "consider the potential impact of a multi-billion-dollar project aimed at recording, preserving, and sharing the most important and fragile tranches of our history to make them widely available for ourselves and our children" (p. 174). Ngram Viewer is put forward as an enticing way of showing what could be found by exploring such data collections. The fun of tracking ngrams is aptly described as "a new and extremely nerdy form of heroin" (p. 162). The book ends with 48 graphs that illustrate this addictiveness, with charts presented as xkcd-style drawings.
More about these authors: • Jean-Baptiste Michel’s 2012 TED talk on this topic (called The Mathematics of History).
• Erez Aiden will appear as a keynote speaker at the 1st Inaugural Texas Digital Humanities Conference on Networks in the Humanities on April 10-12, 2014.
This review was written for LibraryThing Early Reviewers.
I love Data. I have been involved in the Data Science and Big Data scene for a while, so I was really pleased to have picked up this book from my local library.
The book talks about analyzing human culture using Google's largest collection of digitized books (over 30 million books). It goes into detail about the fame of certain words and slang, and how words change, using the data to show how it might have happened. It also goes deep into other topics, such as how the Nazis censored their country, shown through the absence of certain words in books from that era. The book then poses a bit of a moral dilemma, pointing the reader toward where big data is headed and how, thanks to social media, it will affect our lives even more further down the line.
The points the authors went over were great and varied. I especially liked the one about censorship during the USSR and Nazi eras. Perhaps I am being pessimistic, but I wish the authors had gone deeper into some of the subjects they covered. It seemed like they only brushed the surface at times.
I would recommend this book to anyone, even if they aren't a techie or if "big data" intimidates them. The authors do a great job of making it accessible for anyone to pick up and get into.
For someone who works in the big data space, it managed to captivate even me.
Pretty much think ancestry.com, but for ideas: tracking the rise and fall of concepts like a gossip magazine tracks celebrity relationships. Want to know when "girlfriend" became more popular than "mistress"? They've got charts for that! Curious about how quickly we went from "thy" to "your"? They've got the receipts!
The book is like that friend who's always spouting random facts at parties, except this time they've got hard data to back up their claims. It's a delightfully geeky romp through history, using big data as a time machine to peek into how language, culture, and ideas evolved over centuries. Sure, sometimes it feels like watching someone get way too excited about spreadsheets, but their enthusiasm is infectious. Just be warned: after reading this, you might find yourself obsessively Google-trending everything from pizza toppings to philosophical concepts, trying to spot the next big cultural shift.
Perfect for: data nerds, history buffs, and anyone who's ever wondered why we stopped saying "wherefore" and started saying "why."
Not recommended for: people who think Excel is a form of dark magic.
This is a fascinating, interesting, entertaining, and very well written story about language, culture, and big data from the creators of Google's Ngram Viewer. So why only three stars?
The book starts with a discussion of whether a picture is worth a thousand words or a million, and sadly, for a book so taken with the visual representation of data, the pictures here aren't worth the price of admission. Multiple long thin lines on a graph may work well on a computer screen in primary colors (as in Google's Ngram Viewer), but rendered as only vaguely dissimilar shades of gray on the printed page, those same thin lines are an exercise in frustration. The authors really should have worked harder to differentiate the lines on their many graphs; with some effort I could puzzle out which line was which, but the burden of making the point obvious should not fall on the reader. These graphs could easily have been a source of inspiration, but as printed they were mostly just an annoyance.
It's easy enough to recreate many of the graphs online using the tool google provides if you're near a computer and interested (and you should be interested - the data are really cool). But printing them in the book as they are is just a waste of time and paper, and the authors really should have made the effort to do it better.
The book has an interesting premise: using counting as a way to track the evolution of language. The problem is that all the actual counting is boring, and all of the conclusions are "look at how cool this is" with few attempts to provide an explanation. I thought this book was going to be good when they were describing the phasing out of irregular verb conjugation, but then the other parts were simply charts to answer boring questions.
What I would have liked is a more formal time-based or high-dimensional analysis. Yes, it's cool to see how word frequencies go up and down, but there are no conclusions after the fact. The book is basically summarized by the appendix that dumps charts of word frequencies over time. In fact, the book is basically summarized by playing with the n-gram viewer. So go play with that instead of reading the book.
Really cool concept. Started off really strong. Can we map human history, and rather than letting our subjective tales provide the narrative, can we use data science to get real evidence? Then they do it a few times by showing relative frequencies of words or spellings over time (this, sadly, was the most effective use).
Then they pose cool questions and don't answer all of them until an appendix that is printed in greyscale. I couldn't tell which line went with which item in the legends, which was mildly infuriating. I didn't really care about their copyright challenges with Google Books or the pitching and development of their report. I think they were padding to get a book out of what should have been an article. +1 for cool use of data science, history, and graphs. -1 for horrible formatting, personal asides, and length. Probably wouldn't recommend.
What started like a very promising book ended up being a collection of snapshots on specific term comparisons using Google Ngram. It was interesting but a bit shallow as a consequence. This is unfortunate.
What I found fascinating was the part (in the first pages of the book) on the origin of irregular verbs. Anomalies in Zipf's law, they are relics of the Proto-Indo-European language (6,000-12,000 years old), whose ablaut pattern survives in forms such as ring/rang/rung and sing/sang/sung. They survived into the Proto-Germanic language (500-250 BCE), but they are progressively being wiped out of our modern language. And one could make predictions as to the next irregular verbs to disappear! That's really the main example I will remember from this book. It was also fun, I must admit, to try to reproduce some of the book's plots on Google Ngram.
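The prediction rests on the authors' earlier finding on verb regularization (published in Nature in 2007 and revisited in the book's opening chapters): rarer irregular verbs regularize sooner, with a half-life that grows roughly as the square root of usage frequency. Here is a toy Python sketch of that rule of thumb only; the verbs, frequencies, and the constant k are invented placeholders, not data from the book.

```python
# Toy illustration of "rarer irregular verbs regularize first."
# Frequencies are made-up placeholders (occurrences per million words),
# not figures from the book or the Ngram corpus.
irregular_verbs = {
    "be": 40000.0,
    "sing": 60.0,
    "burn": 80.0,      # 'burnt' vs. 'burned' still compete
    "dwell": 4.0,      # 'dwelt' is fading
    "thrive": 0.5,     # 'throve' has largely given way to 'thrived'
}

def regularization_half_life(freq_per_million, k=500.0):
    """Half-life in years, assumed proportional to sqrt(frequency).
    The constant k is arbitrary, chosen only to make the numbers readable."""
    return k * freq_per_million ** 0.5

ranked = sorted(irregular_verbs.items(), key=lambda kv: kv[1])
print("Predicted order of regularization (soonest first):")
for verb, freq in ranked:
    print(f"  {verb:8s} freq={freq:8.1f}/M  ~half-life {regularization_half_life(freq):7.0f} yr")
```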
Before you read this, know that it is outdated and don’t fault it for that, because when you get past phrases like “web start up called Snapchat”, it is truly worth the read. This book is fantastic from start to finish. I had no idea what it was about when I picked it up, and it is not a topic I have ever read about, but it was fascinating. The writing style is great and there were parts that made me laugh out loud. At a non fiction book. About data. If this topic is at all interesting to you, please pick this book up because it’s delightfully nerdy and interesting and just really, really good.
This book is an extension of Aiden's and Michel's article (Michel et al. 2011) that became an instant classic within the quantitative literary criticism world. It provides a fascinating backstory to how the digitization process began (thankfully Google's Larry Page enjoys books quite a bit) and the history of a few key players in QLC (some even before computers existed). It also provides some insight into where QLC might bring us, both the good and the bad. An excellent book for both the quantitative and humanities types.
Take 40 million books, throw an analyzer at them, and what do you get? A way to understand society that uses brute-force statistics rather than hearsay and anecdotes. I especially enjoyed the section on the fragility of fame.
Interesting material but jumbled style and organization. The book has too many little stories and ignores established research, like hundreds of years of work by linguists. The authors come off a bit like know-it-alls who don't know quite as much as they think they do.
This book promotes an interesting program which Aiden and Michel helped to develop (the Google Ngram Viewer) and a term they invented (Culturomics - the use of huge amounts of digital information to track changes in language, culture and history), yet I feel they are only touching the surface with the technology they helped to create.
The Ngram Viewer and culturomics can be useful (software engineer Jeremy Ginsberg showed that, by analyzing Google search records for a region, a flu epidemic can be identified quickly, providing an early warning system for that region), yet the examples they give from their own research are either a reaffirmation of something we already know (the words "unemployment" and "inflation" are used more during economic depressions) or something we could care little to know beyond its value as an interesting tidbit ("doughnut" was overtaken by "donut" soon after the business Dunkin' Donuts began). They state that "digital historical records are making it possible to quantify our human collective as never before" and that their culturomics is a "microscope to measure human culture"...yet the book lacks the deep thinking to reach a worthwhile goal, offering only the interesting "potato chips for intellectuals," as William Grimes put it in his New York Times review of 'Uncharted.' For now, it is mostly increasing awareness of trivial matters. But there is always the future to look forward to...and much more data.
Quote from page 10: 'As we experience all that contemporary life has to offer, as we live out more and more of our lives on the Internet, we've begun to leave an increasingly exhaustive trail of digital bread crumbs: a personal historical record of astonishing breadth and depth.' How much of a trail? From page 11: one bit (binary digit) is like one yes/no question, where 1 is yes and 0 is no. 'The average person's data footprint... is a little less than one terabyte,' or about 8 trillion yes-or-no questions. 'Humanity produces five zettabytes each year: 40,000,000,000,000,000,000,000 (forty sextillion) bits.'
That's why it's called big data. The total data footprint is doubling each year! Digital records make it possible to transform and manipulate information reliably, yet this territory will probably remain mostly uncharted for a long time, or maybe just become a massive wasteland that only a few will care to visit.
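For readers who want to check the arithmetic behind the figures quoted above, here is a minimal Python sketch using decimal (SI) units; the only inputs are the book's own numbers.

```python
# Sanity-check the arithmetic behind the quoted figures.
# Decimal (SI) units: 1 TB = 10**12 bytes, 1 ZB = 10**21 bytes, 1 byte = 8 bits.

terabyte_bits = 10**12 * 8                        # -> "about 8 trillion yes-or-no questions"
yearly_bits = 5 * 10**21 * 8                      # 5 ZB/year -> "forty sextillion bits"

print(f"1 TB = {terabyte_bits:,} bits (~8 trillion)")
print(f"5 ZB = {yearly_bits:,} bits (~40 sextillion)")
print(f"Yearly output equals {yearly_bits // terabyte_bits:,} one-terabyte personal footprints")
```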
This book's concluding chapter focuses on similar future developments (life logging and mind-machine interfaces) that Smarter Than You Think by Clive Thompson covers in more depth and breadth. I would recommend simply checking out their TED talk What we learned from 5 million books...it's a brief account of what is covered in Uncharted, and that is really all the information that is needed. Read Clive Thompson's Smarter Than You Think to gain deeper insight into the uncharted territory of technology. And read Dave Eggers's The Circle for a fun and insightful fictional look into the future of life logging.
The authors worked with the folks at Google to create an interesting sort of index. They looked at the corpus of some 33 million books that Google had scanned for the Google Books project and counted the occurrences of all words and short sequences of words, up to five words in length, published in a given year. The result came to be the Google Books Ngram Viewer, where you can type a handful of words or phrases into a comma-separated list and graph their frequency of usage over the years from 1800 to 2008.
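To make that counting step concrete, here is a minimal Python sketch of tallying per-year n-gram counts (up to 5-grams) over a toy corpus. This is not the authors' actual pipeline; the sample texts, years, and the normalization choice are invented for illustration (the real Viewer's normalization differs in detail).

```python
from collections import Counter, defaultdict

# Toy corpus: (publication_year, text). The texts are invented examples,
# not excerpts from the Google Books corpus.
corpus = [
    (1905, "the telegraph changed everything"),
    (1905, "the telephone changed everything again"),
    (1998, "the internet changed everything once more"),
]

def ngrams(tokens, max_n=5):
    """Yield all 1- to max_n-grams from a token list, as space-joined strings."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

# counts[year][ngram] -> number of occurrences in books published that year
counts = defaultdict(Counter)
for year, text in corpus:
    counts[year].update(ngrams(text.lower().split()))

def frequency(ngram, year):
    """A simple relative-frequency measure: occurrences of the n-gram divided
    by the year's total word (1-gram) count."""
    total_words = sum(c for g, c in counts[year].items() if " " not in g)
    return counts[year][ngram] / total_words if total_words else 0.0

print(frequency("changed everything", 1905))
print(frequency("the internet", 1998))
```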
Why is this interesting? Well, it gives a certain mathematical exactitude to the popularity of words and phrases over time. For example, type in "telegraph, telephone, radio, television, Internet" and you can graphically see when each technology entered the scene, and the relative stir it created, and compare them to each other (spoiler: "Internet" is a pretty big deal right about now). Or, you can type in the names and use it as a measure of fame (spoiler: the Beatles were never anywhere near as big as Jesus).
The book deals with the details of their technique and offers some thoughts on the promise of using this kind of "Big Data" for what they call Culturomics, an attempt to make the "science" of social sciences a bit more than wishful thinking.
Reading the book can be a very interactive experience. I found myself getting up and playing with the Ngram Viewer online, coming up with variations on their examples, or going off on tangents of my own. Since the book was written, Google has added some refinements. You can use wildcards, like "Queen of *", which charts the top ten ways to finish that phrase. Or you can tag a word to isolate verb or noun forms, so "liaison_VERB" can be used to find that particularly dark period of our language when the word was used that way.
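If you would rather script such queries than use the web form, the Viewer is commonly reached through an unofficial JSON endpoint. The sketch below assumes that endpoint and the parameter names visible in the Viewer's own graph URLs (content, year_start, year_end, corpus, smoothing); Google does not formally document this interface, and it may change or be rate-limited without notice.

```python
# Sketch of scripting an Ngram Viewer query via its unofficial JSON endpoint.
# Assumption: the endpoint and parameter names mirror those in the Viewer's
# graph URLs; this is not a documented, supported API.
import json
import urllib.parse
import urllib.request

def ngram_query(phrases, year_start=1800, year_end=2008, corpus="en-2019", smoothing=3):
    params = urllib.parse.urlencode({
        "content": ",".join(phrases),   # wildcards ("Queen of *") and tags ("liaison_VERB") go here too
        "year_start": year_start,
        "year_end": year_end,
        "corpus": corpus,
        "smoothing": smoothing,
    })
    url = f"https://books.google.com/ngrams/json?{params}"
    with urllib.request.urlopen(url) as resp:
        # Expected shape: a list of {"ngram": ..., "timeseries": [...]} records
        return json.load(resp)

# Example (requires network access):
# for series in ngram_query(["telegraph", "telephone", "radio", "television", "Internet"]):
#     print(series["ngram"], series["timeseries"][-5:])
```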
All in all, pretty fun. Even if you don't have time to read the book, pull up the Ngram viewer in your browser, and fool around a bit.
Everyone has heard of Big Data: huge amounts of information, usually involving computers or the Internet. Is there a cultural or historical equivalent of Big Data?
Yes, and it comes from Google's intention to digitize all the world's books (or, at least, a significant portion of them). The authors created an algorithm that would search all those books for certain words. On a chart, it will show, for instance, how many times, per million words, the name "Abraham Lincoln" was used, or "World War II." It can also be used to compare the historical use of pairs of words, like Satan/Santa, evolution/DNA, men/women, war/peace, tea/coffee or old school/new school. It can be found at books.google.com/ngrams ("Possibly the greatest time-waster in the history of the Internet." - Mother Jones magazine). Google needed convincing that this was a good idea, that it would not open them up to millions of copyright infringement lawsuits.
Using this algorithm, it is possible to look at things like historical attempts at censorship. It can range from Nazi attempts to remove Jewish artists like Marc Chagall from the German cultural landscape, to the 1950s Hollywood Blacklist. A person can also look at how long a certain word or phrase stays in the cultural memory. For instance, "Korean war" has a huge jump in usage in the 1960s, then an equally huge drop in usage soon after, down to its present level of almost nothing.
The book also looks at the evolution of the English language. If we have pairs of words like drive/drove, what happened to thrive/throve? Also, what happened to words like burnt, learnt and dwelt? It all has to do with irregular verbs, which change over time.
This is a fascinating book, but it will take some effort on the part of the reader. It's very well done, and it gives the reader the chance to do their own historical research.
This book summarizes the PhD theses of the two co-authors, and builds on a research article published in Science in 2011. The project itself is fantastic, as is the Google Books project (at least in terms of the scope of data scanned and generated); however, this book falls a little short in digging in to it.
First, there are no endnotes or footnotes or sidenotes (a la Tufte); there is a chapter of "Notes" at the end, but they aren't easily referenced from the text. Second, every plot is essentially a histogram of word frequency vs. time, binned by year, and the y-axis changes between "mentions per billion words" and "mentions per million words" without much comment. Third, perhaps over the authors' objections, the book is printed in greyscale: with a dataset of nearly six million books, using color as an additional visualization channel would be quite helpful. The greyscale also limits the comparisons to about two to six topics per plot. Further, no "small multiples" are used, which limits the plots to two per page. This could be addressed with more expensive printing (a la Stephen Few's or Edward Tufte's books and layouts), the use of Tableau or other visualization software, or simply more time spent on the visualization of the enormous datasets analyzed; any of these would bring the analysis to life. Many peaks and dips in their plots also go unexplained, except for the obvious, e.g. 9/11 brought about a spike in mentions of Pearl Harbor.
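As an illustration of the "small multiples" layout suggested above, here is a short matplotlib sketch; the time series are synthetic placeholders, not Ngram data, and the file name is arbitrary.

```python
# Sketch of a "small multiples" layout: one panel per term, readable even in greyscale.
# The time series below are synthetic placeholders, not Ngram Viewer data.
import numpy as np
import matplotlib.pyplot as plt

years = np.arange(1800, 2009)
terms = ["telegraph", "telephone", "radio", "television"]
rng = np.random.default_rng(0)
series = {t: np.abs(np.cumsum(rng.normal(0.0, 1.0, years.size))) for t in terms}  # fake frequencies

fig, axes = plt.subplots(len(terms), 1, figsize=(6, 8), sharex=True)
for ax, term in zip(axes, terms):
    ax.plot(years, series[term], color="black")   # one series per panel, so no legend is needed
    ax.set_title(term, loc="left", fontsize=9)
    ax.set_yticks([])                              # the shape of each curve matters more than its scale here
axes[-1].set_xlabel("year")
fig.suptitle("Relative frequency over time (synthetic data)")
fig.tight_layout()
fig.savefig("small_multiples.png", dpi=150)
```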
Overall, the technical effort was monumental and the launch of Google N-grams was undoubtedly a great public service. But this book fell a little short of those standards.