Collective Intelligence in Action

Name: Collective Intelligence in Action
Rating: 3.78 (6 reviews)
ISBN: 9781933988313

Satnam Alag

There's a great deal of wisdom in a crowd, but how do you listen to a thousand people talking at once? Identifying the wants, needs, and knowledge of internet users can be like listening to a mob.

In the Web 2.0 era, leveraging the collective power of user contributions, interactions, and feedback is the key to market dominance. A new category of powerful programming techniques lets you discover the patterns, inter-relationships, and individual profiles (the collective intelligence) locked in the data people leave behind as they surf websites, post blogs, and interact with other users.

Collective Intelligence in Action is a hands-on guidebook for implementing collective intelligence concepts using Java. It is the first Java-based book to emphasize the underlying algorithms and technical implementation of vital data gathering and mining techniques like analyzing trends, discovering relationships, and making predictions. It provides a pragmatic approach to personalization by combining content-based analysis with collaborative approaches.

This book is for Java developers implementing Collective Intelligence in real, high-use applications. Following a running example in which you harvest and use information from blogs, you learn to develop software that you can embed in your own applications. The code examples are immediately reusable and give the Java developer a working collective intelligence toolkit.

Along the way, you work with a number of APIs and open-source toolkits, including text analysis and search using Lucene, web crawling using Nutch, and applying machine learning algorithms using WEKA and the Java Data Mining (JDM) standard.

Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

Genres: Computer Science, Programming, Technology, Artificial Intelligence

425 pages, Paperback

First published August 15, 2008

8 people are currently reading

108 people want to read

About the author

Satnam Alag

1 book

Community Reviews

5 stars

11 (18%)

4 stars

27 (46%)

3 stars

17 (29%)

2 stars

2 (3%)

1 star

1 (1%)

Displaying 1 - 6 of 6 reviews

Sopan Shewale

21 reviews3 followers

November 12, 2008

Have you ever been surprised to see a line similar to "Hello, greatguy. We have recommendations for you" at the top
when you log in to the Amazon.com site? Yes, this kind of functionality is very easy to
implement in your application after reading Satnam's "Collective Intelligence in Action".

Have you ever wondered how Netflix is able to recommend movies, what the latest trends are
in making search more intelligent, or how you can intelligently gather new content and
present it in your application?

In this book, Satnam does an excellent job of providing the answers to all these questions.
The book covers a wide breadth of topics with amazing focus and detail: architecture
for adding intelligence, tagging and tag clouds, content aggregation through focused web
crawling and from the blogosphere, leveraging machine learning techniques such as clustering
and predictive modeling, intelligent search, and building a recommendation engine.

I particularly like the approach of explaining the mathematical concepts with simple examples,
followed by implementing them in simple Java and then leveraging open-source software.

This book can be useful if you are interested in integrating different open-source software
to deliver an enterprise-class application.

I also liked the author's style of providing a summary at the end of each chapter.
He also provides a huge set of very useful resources for further reading on the topics
covered in the chapters.

You must pick up this book if you are

[1]. serious (as a developer, manager, or architect) about adding search or
intelligence/smartness to your application
[2]. a person involved in developing (as a programmer, tester, or manager) a social
networking application
[3]. involved in managing the "Knowledge Management Infrastructure" of an organization of any size

This book will provide you with a great foundation for developing enterprise-class
features.

I highly recommend it.

reading-done-2008

Glenn

21 reviews1 follower

June 4, 2009

Collective Intelligence is a big topic so the title of this book is very ambitious. The author is too. He sets out to explain all things collective intelligence circa 2009.

Like any good textbook, the material is presented in an iterative and incremental way. High-level concepts are introduced first. Popular features such as tag clouds and recommendation engines are defined. How collective intelligence manifests in the GUI and screen flow of various popular web properties is illustrated. Various data structures and algorithms are classified. Simple examples of some of the underlying math are worked out. Open source libraries are recommended and coding examples presented. Topics are introduced simply at first and revisited in later chapters with ever-increasing amounts of detail.

However, there's a lot of math in the topic of collective intelligence and this is not a math book. Neither does it take a "heads up" or "for dummies" approach. Instead, it aims somewhere in the middle which gives the book a bit of an identity crisis.

If you're a script jock or a webmaster whose level of technical competence doesn't extend too far past embedding a YouTube video or PayPal "buy now" button on your pages, then the most that you are going to get out of this book is just the high-level introductory material.

The formally trained software engineer can treat this book as a great introductory survey-level course on the subject that does attempt to peel back some of the layers of complexity. It is neither definitive nor canonical, so you will be doing more research and studying before you deploy any sophisticated collective intelligence in your applications.

Phoenix

482 reviews32 followers

August 6, 2018

11 of 11 people found the following review helpful
How Web 2.0 Works

To really understand this book one would probably have to be a Java programmer, but it is still possible to follow the argumentation, though a background in college level math would be recommended.

The basic idea is that one can catalog documents by removing irrelevant words (adjectives, abstract pronouns, conjunctions) and "stemming" the remaining words (i.e., reducing "sews", "sewing", "resew", "sewer" to a root "sew"), then creating a vector containing each root word and its frequency and normalizing it. One simple result is the ability to produce "word clouds". Similarity between documents is measured by taking the dot product of the two vectors. Any document compared to itself would have a dot product of 1. Two documents with no common stem words would have a dot product of zero. Similar docs would have a high value close to 1, say .8. Dissimilar docs would have a low coefficient, say .15. Even mistaking "sewer" (a conduit for waste) for "sewer" (one who uses a needle and thread) is taken into account, because the two docs would be similar on only a couple of keywords and dissimilar on most others.
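The steps in the paragraph above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the book's code: the class name `TermVectorSketch`, the tiny stop-word list, and the suffix-stripping `stem` method are all hypothetical stand-ins (a real system would likely use a proper stemmer and a much longer stop-word list).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch of term-vector similarity; not taken from the book. */
class TermVectorSketch {

    // Hypothetical stop-word list; a real one would be far longer.
    static final Set<String> STOP_WORDS =
            Set.of("a", "an", "and", "the", "of", "to", "is", "in", "it");

    // Toy stemmer (an assumption): strips a few common suffixes,
    // keeping at least three characters of the word.
    static String stem(String word) {
        for (String suffix : new String[] {"ing", "ers", "er", "ed", "s"}) {
            if (word.endsWith(suffix) && word.length() - suffix.length() >= 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Builds a unit-length vector mapping each stem to its normalized frequency.
    static Map<String, Double> vectorize(String text) {
        Map<String, Double> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            counts.merge(stem(token), 1.0, Double::sum);
        }
        double sumOfSquares = 0.0;
        for (double count : counts.values()) {
            sumOfSquares += count * count;
        }
        if (sumOfSquares == 0.0) {
            return counts; // no usable terms, nothing to normalize
        }
        final double length = Math.sqrt(sumOfSquares);
        counts.replaceAll((term, count) -> count / length);
        return counts;
    }

    // Dot product of two normalized vectors: 1 for identical documents,
    // 0 for documents that share no stems.
    static double similarity(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> entry : a.entrySet()) {
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0.0);
        }
        return dot;
    }

    public static void main(String[] args) {
        Map<String, Double> quilting = vectorize("She sews and is sewing the quilt");
        Map<String, Double> plumbing = vectorize("The plumber repaired pipes");
        System.out.println("self:      " + similarity(quilting, quilting));
        System.out.println("unrelated: " + similarity(quilting, plumbing));
    }
}
```

Because both vectors are scaled to unit length before comparison, the dot product here is the cosine of the angle between them, which is why it stays between 0 and 1 for frequency vectors.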

What's really neat is how this information gets collected and can be applied. Social networking sites, including the one you are reading right now, Amazon.com, collect data on us through our choices. Browse for a book while logged on, and that's something you are interested in. Approve a review, and the words in the review, the summary of the book, and the title count towards your interests. Disapprove, and that counts against your interests. Write a review, and the words you write become part of your cumulative profile as well, reduced to a vector or vectors of keywords and frequencies.

Here's how it gets applied: One of Amazon's marketing tools is its "recommendation engine". (The book talks about Netflix's recommendation engine and business model.) By matching your vector against those of other people who have bought or viewed what you have bought, a prediction can be made as to the likelihood of you being interested in something that they have bought, or not interested in items that they rejected or disliked. The more Amazon caters to what you are interested in, and doesn't bother you with irrelevancies, the happier you may be.
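A rough sketch of that matching step, under loud assumptions: this is not how Amazon or Netflix actually do it, nor the book's code. The class `RecommenderSketch`, the +1/-1 encoding of liked and disliked items, and the similarity-weighted average in `predict` are all choices made for illustration.

```java
import java.util.Map;

/** Illustrative user-to-user recommendation sketch; all names are hypothetical. */
class RecommenderSketch {

    // Cosine similarity between two users' item ratings
    // (assumed encoding: +1 for bought/liked, -1 for rejected/disliked).
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Double> entry : a.entrySet()) {
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0.0);
            normA += entry.getValue() * entry.getValue();
        }
        for (double value : b.values()) {
            normB += value * value;
        }
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    // Predicts the target user's interest in an item as the similarity-weighted
    // average of other users' ratings for it. Returns 0 when nobody has rated it.
    static double predict(Map<String, Double> target,
                          Iterable<Map<String, Double>> others,
                          String item) {
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (Map<String, Double> other : others) {
            Double rating = other.get(item);
            if (rating == null) {
                continue; // this user has no opinion on the item
            }
            double sim = cosine(target, other);
            weightedSum += sim * rating;
            totalWeight += Math.abs(sim);
        }
        return totalWeight == 0.0 ? 0.0 : weightedSum / totalWeight;
    }
}
```

A positive prediction suggests the item is worth recommending, and a negative one suggests leaving it out. Users with opposite tastes get a negative similarity, so their dislikes count mildly in an item's favor.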

Other applications discussed include the automatic creation of folksonomies (taxonomies based on popular usage) using cluster analysis, and categorization using Bayes' theorem.

In addition to recommendation engines, Alag points out the usefulness of these techniques to search and points out several search engines that apply this approach (as does Google). Other uses include tools that search out and provide news based on your preferences or suggest "friends" (e.g., Facebook or eHarmony might use these ideas), searching for similar material to identify copyright infringement, email filters that keep out spam for Rolex watches or Viagra (unless you are interested in Rolex watches or Viagra), constructing a virus detection engine based on code phrases, and early detection of epidemics or adverse reactions to medication through similarities in medical reports. Alag himself appears to be working at a biotech firm, NextBio, that matches public medical and genome-related data to data held by private companies.

Some of the basic tools discussed are Lucene, a free version of what Google will sell you for a search engine; Nutch, a free web crawler (both of which require coding); and WEKA, a free open source data mining package that looks usable by the rest of us.

Loved the book and the author's organization of the material. Some of the social implications are scary, especially for privacy concerns, but so is the implication of not leveraging the information that one holds within one's organization to provide the best possible service. For example, the World Bank has the capability (not necessarily using these methods) to match similar projects around the world so that experience gained in one area can be found and applied elsewhere. This is a key, fast-moving tech that one needs to understand in order to see where we are going as a society. C.I. in Action is merely the opening salvo: the methods and techniques described are the basics, but there is much room for refinement and elaboration, and this topic could be the start of a whole new field. The book also recommends and has sparked my interest in the site [...] which is probably more accessible to someone without a math or tech background.

Finally a note to SF fans, this may be the point at which the Web starts to appear to be intelligent. :-)

math-and-science

Fred Tyre

130 reviews5 followers

August 25, 2022

I learned Python (got a great start anyway) from this book. Is it possible to know everything about a programming language? However, I really want to point out how this author was able to break down machine learning in a way that his predecessors always made sound so complicated. Things just made so much more sense after reading this book. Unfortunately, a lot of the APIs in this book no longer work (an unfortunate thing of our time - things aren't made to last these days).

artificial-intelligence technology

AHolly

27 reviews

April 3, 2009

I have just started this book but can already tell that it contains far more useful information than O'Reilly's. O'Reilly's seemed a little too, "type this... then type this," and not enough, "here's why you type this," or even, "here's what we are trying to accomplish by typing this."

(will change my rating if I change my mind as I progress further)