Why doesn't your home page appear on the first page of search results, even when you query your own name? How do other web pages always appear at the top? What creates these powerful rankings? And how? The first book ever about the science of web page rankings, Google's PageRank and Beyond supplies the answers to these and other questions. The book serves two very different audiences: the curious science reader and the technical computational reader. The chapters build in mathematical sophistication, so that the first five are accessible to the general academic reader. While other chapters are much more mathematical in nature, each one contains something for both audiences. For example, the authors include entertaining asides such as how search engines make money and how the Great Firewall of China influences research. The book includes an extensive background chapter designed to help readers learn more about the mathematics of search engines, and it contains several MATLAB codes and links to sample web data sets. The philosophy throughout is to encourage readers to experiment with the ideas and algorithms in the text. Any business seriously interested in improving its rankings in the major search engines can benefit from the clear examples, sample code, and list of resources provided.
Many illustrative examples and entertaining asides
MATLAB code
Accessible and informal style
Complete and self-contained section for mathematics review
I really love the way they wrote this book. It is not quite a layman text like Stephen Hawking and not quite a textbook. It has just the right balance of detail (akin to a Chakrabarti lite) to be nutritional, and fun historical commentary to be entertaining. The final chapters are fairly math heavy, but it is stuff most of us should have learned in junior high anyways. Props to Larry and Sergey for inventing the algo, but these guys are really the first to formalize and analyze it properly.
I think for me, this would have been a better book if at least one of the authors had been a computer scientist or engineer instead of a mathematician (and it would have been even better if they included an author with some background in the social sciences, although then it would have been a different book). They give an exhaustive description of PageRank as a linear algebra problem, peppered occasionally with what they conceive of as interesting facts about Google and search, but the two parts don't really fit together. And they have no apparent interest in what it actually means to define web pages as "popular" or "important" by their link structure. (And code samples in MATLAB? Not the best way to actually illustrate an algorithm.)
And as usual with these sorts of things, the mathematics chapter at the end is so terse that it is only of use if you knew the stuff at one time, but had gotten a bit rusty.
This book is a bit of a historical artifact, littered with names of long-dead search engines, overly concerned with what fits in main memory, innocent of map-reduce or GPU processing (although there is a bare mention of parallel processing), and only barely aware of issues raised by search engine dominance.
The first few chapters of this book are a flawed attempt to explain some basic Library and Information Science theory by a couple of arrogant mathematicians. In their arrogance, they get some key points wrong (they completely misunderstand the nuanced relationships among relevance, recall, and precision), fail to credit librarians for things we invented long ago (HITS is just a big citation index), and attempt to minimize the importance of libraries in our past, present, and future. Having said that, though, their explanation of how search engines work (what this book is supposed to be about after all) is pretty good. After the third or fourth chapter it just degenerates into a math textbook.
Doing research for a paper I came across this book. I'm now looking forward to my plane trip so I can read this on the way. June/July I gave it a go on my own but the math is way above my head. This is now on the "to-read with hubby" book shelf.
Google's PageRank is the world's largest linear algebra computation. The authors, who are both professors of mathematics unaffiliated with Google Inc., describe what it is and how it might work (its exact workings, including how it combats spammers and link farmers, are of course a trade secret).
Wonderful book, simple enough that one can grasp everything from the operational workflow of search engines to the underlying mathematics behind the PageRank algorithm.