Data mining nel social web

Facebook, Twitter, and LinkedIn generate an enormous amount of highly valuable data on the social web. But how can you identify the people who are connected through social media, discover what they are talking about, or locate the cities they belong to? This book answers these and other questions. It teaches you to combine social web data, analysis techniques, and visualization tools to bring to light exactly what you are looking for in the ocean of the web, as well as further information whose existence you did not even suspect.

Each chapter is structured as a self-contained unit and presents techniques for extracting data from different areas of the social web, including blogs and email. All you need to get started is a programming background and the determination to learn the basic tools of Python.

321 pages, Paperback

First published January 1, 2011

109 people are currently reading
1038 people want to read

About the author

Matthew A. Russell

18 books, 10 followers

Ratings & Reviews

Community Reviews

5 stars: 80 (21%)
4 stars: 154 (41%)
3 stars: 106 (28%)
2 stars: 28 (7%)
1 star: 7 (1%)
Displaying 1 - 21 of 21 reviews
Doug Lautzenheiser
30 reviews, 8 followers
October 13, 2012
This short book might have more appropriately been titled, "How I Personally Mined the Social Web using Python."

Without giving too much explanation, the author provides samples of his Python routines. Where another author might spend an entire chapter (if not the whole book) explaining a technological topic, Russell just makes a comment and moves on to his code examples. If you are comfortable with, "Install this, run that command, and now copy my code..." then this is an okay book.

This is basically a Python cookbook with Social Media recipes. It covers APIs useful for Google e-mail, Twitter, Facebook, and LinkedIn. As such, it was interesting reading to see how it is done, but this is not a primer on how to do it.
Ietrio
6,936 reviews, 24 followers
May 31, 2020
A book on data mining. Interesting. Then I open it! Wow!

So data mining starts with why people are on Twitter. A cute bullet list about the human need to be heard. Okay. Maybe it's just a slip. Turn the page over. Well, an insightful paragraph about Twitter having been started at 140 characters! Amazing! Probably your data mining apps won't work if you set the Tweet size to 512 characters!

Next paragraph. More goodies! Twitter is all the rage. Really? Not really. It's interesting for journalists paid by the word as they can put a twist to a badly phrased statement.

I turn the page. And, more data mining information. @HomerSimpson is not a real person! And it does not stop here! Can you imagine it is in relation with the Fox sitcom and not the president of North Korea! Amazing!

So how can I do such high quality data mining? Well, let's see a page about TweetDeck, which does data mining? No, but by this time you already know this book isn't about data mining.

More precious data mining somehow related to this load of manure: the importance of PyDoc. So you have to know Python to follow this clown's examples, but data mining is about PyDoc. Or something.

The data is said to be at GitHub for free, because the editor and the author make too little money to invest in a domain. But somehow that is lost and the links are obfuscated with bit.ly links to track you. So are you learning about data mining, or are these unscrupulous characters data mining your account?
Louis
228 reviews, 32 followers
February 17, 2014
The hardest part of learning a data analysis method is not implementing the method; it is applying the method in the context of a real data problem. Data mining and machine learning texts often skirt the issue by using pre-processed data sets and problems defined to fit the method being taught. Russell uses analysis of social media sites to set a context where you start from having to gain access to real data sets, clean and transform the data into forms that your analytical libraries can make sense of, and then use the results to draw a conclusion. For that, it rates a place alongside any other text that focuses more on the analytical methodology itself.

What I most appreciated about this book was the work put into converting data from one format to another. From the beginning, he works with data pulled using a service's API, then gets that into a format that another library requires, then gets those results into a data mining framework for analysis. Following his flow has helped me understand the methods better. These examples of processing data from format to format address something that gets my students stuck before they really get started on a project. I especially appreciated the chapters that worked with the Natural Language Toolkit (NLTK) and the NetworkX graph libraries. These examples helped me get past what was the hard part for me in working with these libraries in previous encounters.

The virtual machine is also very helpful. I have always found the hardest part of working with Python for analytic computing has been teaching my collaborators how to get set up. And in data mining this is even harder than standard. I was able to get through his book installing everything on one machine, but on another I used the author's virtual machine, and I have pointed a student who was working with me to the virtual machine as well.

This is a great book to work through the mess of implementing data mining methods in real situations. It is not a theory book, but it serves its purpose well.

Note: I received a free electronic copy of this book from the O'Reilly Press Blogger program.
Minh Nhật
92 reviews, 49 followers
February 17, 2020
Having read many books on data science/machine learning, I find this one very, very solid.

4.5, but I was in a good mood and rounded up :))

P.S.: I was going to try mining Goodreads, but the responses are all XML, which is discouraging T^T. There is even a dev group here, and they say the API is about ten years old :v. Seriously, though, does anyone want to give it a try???
Claire Binkley
2,242 reviews, 17 followers
November 18, 2019
What I found most useful from this book was the information these data scientists held within these pages about GitHub. I didn't know anything about this before I opened this book up.

The kawaii icons noting each point to understand in particular are absolutely adorable, as well! I think for a textbook about the new world we're living within today it comes across incredibly nicely.
BUT HONESTLY WHAT MADE ME LAUGH THE HARDEST WAS THE README.1st AT THE BEGINNING - since, I thought to myself, didn't we all click those open in our little games? At least I definitely did!

I remember so clearly~! When I was a little girl like six or seven I would sit with my bird chirping up a storm right on my shoulder as I read the README document the whole way through before I played the cute little text game.
And died a miserable text death, of course.
But the nice thing was that you could start right over again with exactly the same stats!

People asked me "What exactly do they mean by 'mining' in that regard?" and I tell them that you don't typically need your lighted helmet when you're doing this kind of social mining, but it seems to me that it is looking at general trends and making projections for the future.

And, also, look at the adorable woodland creature on the cover!
1 review, 1 follower
March 13, 2019
Very Helpful

I learned a lot and gathered valuable information. I suggest this book to anyone looking for information on social media who wants to be well informed.
1 review
August 7, 2020
About to read it; hope it will be nice.
Jeno
242 reviews, 74 followers
October 18, 2020
A brief overview. Some of the things are dated (API suggestions and Jupyter code), but in general it is a nice overview.
Brad Rice
150 reviews, 1 follower
January 21, 2014
I was given a free e-book and asked by O'Reilly to review it in exchange. I was excited for the opportunity since I think that having the ability to mine the social web is important. I was also happy that the author utilized Python as the programming language of choice to show how this is to be done. I have been using Python as a tool now for about a year and have found it to be my preferred server-side scripting language for web app development. If you are a PHP, Perl, Ruby, or Java developer, I think you could pretty easily transfer the techniques shown to your platform of choice.

The book is not focused on Python development per se, but I think a certain amount of knowledge of that language is helpful to understanding the book. One interesting benefit is the appendix, where the author walks you through the use of IPython Notebooks as a way to show example code and execute it. A virtual machine is set up and then run on a port for you to execute live code in a browser.

Once you are set up and get to the heart of the matter, the author does a good job of covering the main aspects of mining the more notable social media sites such as Facebook, Twitter, and LinkedIn. Introductions to all the APIs and example code showing how to access the data on those sites are well written and explained.

The book is an excellent cookbook and a must-have for the technically minded. However, a shortcoming may be that it does not cover much in the way of theory or objectives of data mining and analysis. While this is an excellent book of how-to, the why of social media mining is left to other sources.
Ehnaton
18 reviews, 10 followers
December 2, 2014
I never used to understand the cookbook format. What is the point of handing out disjointed pieces of code that do what is needed but do not explain the fundamentals? Here I was proven wrong, because with social networks you do not need anything more. The gist is to collect a data set of social contacts, find the patterns that occur most often, and then visualize them nicely. The author manages to cross a certain line and actually teach an interested programmer to dig through this raw, noisy material, and to do it efficiently. I especially liked the idea of putting Jaccard distance to practical use to determine how similar the user audiences of several competitors are. I hammered out code all day, then analyzed the results and presented them to management. It turned out that it would, unexpectedly, be worthwhile to take our product in a slightly different direction. Now that is a proper cookbook. Recommended.
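
For readers unfamiliar with the Jaccard distance mentioned in this review, here is a minimal sketch of the idea. It is not the book's code or the reviewer's: the function name and the follower sets are hypothetical, chosen only to illustrate comparing two audiences as sets of user IDs.

```python
def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two collections of hashable items."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention assumed here: two empty audiences count as identical.
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


# Hypothetical follower IDs for two competing accounts (made-up data).
followers_ours = {"ann", "bob", "cat", "dan"}
followers_rival = {"bob", "cat", "eve"}

# 2 shared followers out of 5 distinct ones, so the distance is 1 - 2/5.
distance = jaccard_distance(followers_ours, followers_rival)
print(f"Jaccard distance: {distance:.2f}")
```

A distance near 0 means the two audiences overlap almost entirely; a distance near 1 means they share almost no one.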
Wael Al-alwani
42 reviews, 15 followers
November 5, 2011
Excellent book. Its beauty lies in the loads of ideas it gives, the efficient ways to implement them, and the tools it talks about. What this book lacks, IMHO, are more detailed discussions of why approach x was followed and what the rationale behind it is. As one commenter said, the book has too many How's but few Why's.
2 reviews
March 8, 2014
Just started with the book but it looks like an interesting read so far

The topics for the book cover a wide range of data mining practices and the book seems like a great way to get into Python and Data Science.
Nicolas Morin
7 reviews, 1 follower
November 11, 2013
I don't usually enter tech books here, because I rarely read them from cover to cover. But this one I did. It's well written and comprehensive. My only caveat is the author's fondness for the heavy VM he uses for his examples...
Gary Lang
255 reviews, 36 followers
August 6, 2011
A lot of interesting stuff to play with here. TBD: convert some of this to C#
Aimable Niyikiza
6 reviews, 17 followers
March 20, 2014
A great start for anyone interested in data mining. Basic Python hacking skills required.
Elena
39 reviews
April 19, 2015
Useful examples in Python. Pretty good.