
Data Mining: Concepts and Techniques


744 pages, Hardcover

First published August 1, 2000

113 people are currently reading
648 people want to read

About the author

Jiawei Han

35 books, 4 followers


Community Reviews

5 stars
123 (29%)
4 stars
163 (39%)
3 stars
96 (23%)
2 stars
26 (6%)
1 star
5 (1%)
Displaying 1 - 22 of 22 reviews
Radek Lát
20 reviews, 1 follower
January 17, 2015
A good collection of data mining techniques. However, for actual implementation of the presented algorithms you might need to look somewhere else because the presented information is not always clear and the examples are often difficult to transform to your own problems.
Austin
13 reviews, 3 followers
January 18, 2014
Jiawei Han was my professor for Data Mining at U of I. He knows a ton and is one of the most cited professors (if not the most) in the Data Mining field. I felt this book reflects that; honestly, his book explains many of the concepts of Data Mining in a more efficient and direct manner than he can in a class setting.

I enjoyed reading his book and learned a lot. There is a reason this is the standard Data Mining book for graduate studies, and I would recommend it to anyone wishing to learn Data Mining.
Parisa
21 reviews, 4 followers
March 30, 2022
This book is suitable for both beginners and intermediate learners. I enjoyed reading this book immensely.
Emmi
132 reviews
August 16, 2017
Finished reading the important sections of this book. It gives clear knowledge of data mining techniques.
eri b.❀
472 reviews, 40 followers
June 22, 2022
Very good intro to data mining concepts, well explained and easy to understand. The math was a bit difficult for me though, but you can always come back to it if you ever need the basics.
Thomson Kneeland
44 reviews, 3 followers
March 17, 2019
Good overview of Data Science techniques and some algorithms.

3 Stars because some computer scientists need to learn Set theory properly. There's no legitimate reason to exchange the symbols of Union and Intersection in a textbook. Mathematics has a well defined pedagogy and history, and with something as basic as a Venn Diagram, the CS field should actually use accepted terminologies. And I am surprised a professional editor would let this pass.

Should we also exchange the functional operations of addition and subtraction...just for data mining? No. Reading through algorithms where the Union symbol means "Intersection" is just a serious impediment to learning for any student of mathematics.

Every Automata Theory textbook I've read defines these symbols properly. No one I've read exchanges Union and Intersection symbols when proving a language is regular.

There's a predefined history of common operations...use them.
65 reviews, 1 follower
October 5, 2020
Good for those who want to get a high level knowledge about data mining in general. As a software engineer I found it beneficial to learn new techniques about data mining phases in order to reach knowledge discovery.
رائد الغامدي
Author of 4 books, 21 followers
December 18, 2015
One of the essential reference books on searching and organizing big data. What is nice about it is that it proceeds in an order that makes it easy for a non-specialist to understand the subject from the very beginning, through the definitions, terminology, and basic rules.
Fabio
144 reviews, 6 followers
January 17, 2019
I'm biased because I took the class with the author, professor Han, so I had more time to digest all the math in it, but I find it an extremely useful coverage of the field.
Soma Boubou
14 reviews
November 16, 2013
First of all, I would like to mention that I am not familiar with data mining and its technology, so you can take my review as a summary of the book, with my personal opinion (not a professional one) where it is needed.

Now, I'm reading:

Unit 6, Mining Frequent Patterns, which deals with finding all frequent itemsets and generating strong association rules.

Every rule holds in the transaction set D with support s and confidence c:
Support = the probability that the two itemsets A and B occur together in a transaction.
Confidence = the probability that a transaction in D that contains A also contains B, i.e. P(B|A).
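To make the two definitions concrete, here is a minimal Python sketch. The transaction set `D`, the item names, and the helper names `support` and `confidence` are all made up for illustration and do not come from the book.

```python
# Toy transaction set D (hypothetical data, for illustration only).
D = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(a, b, transactions):
    """Estimate P(B | A) as support(A and B together) / support(A)."""
    return support(set(a) | set(b), transactions) / support(a, transactions)

# Rule bread => milk: both items appear together in 2 of the 4 transactions,
# and milk appears in 2 of the 3 transactions that contain bread.
s = support({"bread", "milk"}, D)
c = confidence({"bread"}, {"milk"}, D)
print(f"support={s:.2f} confidence={c:.2f}")
```

A rule is usually called "strong" when both values clear user-chosen minimum thresholds.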

The Apriori algorithm is the fundamental method for finding frequent itemsets by confined candidate generation (which is time consuming). P:248.
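The level-wise idea behind Apriori can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions (a minimum support given as an absolute count, toy data, a naive pairwise join), not the book's pseudocode; the function name `apriori` and the sample transactions are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support_count):
    """Return {itemset: count} for every itemset meeting the minimum count."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support_count}
    frequent = dict(current)
    k = 2
    while current:
        prev = list(current)
        candidates = set()
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                union = prev[i] | prev[j]
                if len(union) != k:
                    continue
                # Prune step: every (k-1)-subset must itself be frequent.
                if all(frozenset(sub) in current
                       for sub in combinations(union, k - 1)):
                    candidates.add(union)
        # Count step: one pass over the data per level.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support_count}
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical toy data.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = apriori(toy, min_support_count=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The repeated candidate generation and the data scan at every level are what make the method time consuming on large data.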

Improving the efficiency of Apriori can be done using different variations: P:255-256

- Hash-based Techniques.
- Transaction reduction.
- Partitioning.
- Sampling.
- Dynamic itemset counting.

* Frequent Pattern Growth (FP-growth) method for finding frequent itemsets without the costly candidate generation process. P:257.

* Using Vertical Data format (personally didn't find it interesting)

* Mining Closed and Max Patterns: This requires us to prune the search space as soon as possible using one of the following strategies:
- Item merging.
- Sub-itemset pruning.
- Item skipping.

6.3 (P:264) presents the important idea that support and confidence are not enough and can produce misleading "strong" association rules. For that reason it is suggested that we use correlation measures:
- Lift(A,B) = 1 means A and B are independent.
  Lift(A,B) < 1 means A and B are negatively correlated.
  Lift(A,B) > 1 means A and B are positively correlated.

- X^2 (chi-square): the sum, over all cells of the contingency table, of the squared difference between the observed and expected values, divided by the expected value.
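Both correlation measures can be computed from four counts. The sketch below is an illustration only: the function names and the example counts are my own assumptions, and lift is taken as P(A and B) / (P(A) * P(B)).

```python
def lift(n_ab, n_a, n_b, n):
    """lift(A, B) = P(A and B) / (P(A) * P(B)), computed from raw counts."""
    return (n_ab / n) / ((n_a / n) * (n_b / n))

def chi_square(n_ab, n_a, n_b, n):
    """Sum of (observed - expected)^2 / expected over the 2x2 table cells."""
    # Rows: A present / A absent. Columns: B present / B absent.
    observed = [
        [n_ab, n_a - n_ab],
        [n_b - n_ab, n - n_a - n_b + n_ab],
    ]
    row_totals = [n_a, n - n_a]
    col_totals = [n_b, n - n_b]
    total = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if A and B were independent.
            expected = row_totals[i] * col_totals[j] / n
            total += (observed[i][j] - expected) ** 2 / expected
    return total

# Illustrative counts: 10,000 transactions, 6,000 contain A,
# 7,500 contain B, and 4,000 contain both.
lift_value = lift(4000, 6000, 7500, 10000)
chi2_value = chi_square(4000, 6000, 7500, 10000)
print(lift_value, chi2_value)
```

With these counts the lift falls below 1, so A and B are negatively correlated even though the rule A => B has a confidence of about 67%.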

Other pattern evaluation measures that have gained interest lately are all_confidence, max_confidence, Kulczynski and cosine. Each measure takes a value between 0 and 1; the higher the value, the closer the relationship between A and B.

all_conf(A,B) = min{P(A|B),P(B|A)}
max_conf(A,B) = max{P(A|B),P(B|A)}
Kulc(A,B)= 1/2 {P(A|B)+P(B|A)}
cosine(A,B)= sqrt{P(A|B)*P(B|A)}
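As a rough sketch, the four measures can be computed directly from support counts, taking cosine as the square root of the product of the two conditional probabilities (their geometric mean). The function name and the example counts are hypothetical.

```python
from math import sqrt

def pattern_measures(sup_a, sup_b, sup_ab):
    """Compute the four measures from the support counts of A, B, and A with B."""
    p_b_given_a = sup_ab / sup_a
    p_a_given_b = sup_ab / sup_b
    return {
        "all_conf": min(p_a_given_b, p_b_given_a),
        "max_conf": max(p_a_given_b, p_b_given_a),
        "kulc": 0.5 * (p_a_given_b + p_b_given_a),
        "cosine": sqrt(p_a_given_b * p_b_given_a),
    }

# Hypothetical counts: A occurs in 1000 transactions, B in 100, both in 100.
measures = pattern_measures(sup_a=1000, sup_b=100, sup_ab=100)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

None of the four uses the total number of transactions, only the three support counts.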

The previous six measures were examined on six typical data sets (Page:269).
Lift and X^2 are strongly influenced by the number of null-transactions (transactions that contain none of the itemsets being examined).

A measure is null-invariant if its value is free from the influence of null-transactions.

Imbalance Ratio(IR): which assesses the imbalance of two itemsets, A and B, in rule implications.

IR(A,B)= |sup(A)-sup(B)|/(sup(A)+sup(B)-sup(A&B)) [0~...]

For imbalanced data, where the last four measures give confusing values, we use the two measures IR and Kulc together.
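A small Python sketch of how the two measures complement each other. The function names and both sets of counts are illustrative assumptions, with sup(A&B) read as the number of transactions containing both A and B.

```python
def imbalance_ratio(sup_a, sup_b, sup_ab):
    """IR(A,B) = |sup(A) - sup(B)| / (sup(A) + sup(B) - sup(A&B))."""
    return abs(sup_a - sup_b) / (sup_a + sup_b - sup_ab)

def kulczynski(sup_a, sup_b, sup_ab):
    """Kulc(A,B) = 1/2 * (P(A|B) + P(B|A))."""
    return 0.5 * (sup_ab / sup_b + sup_ab / sup_a)

# Two illustrative cases with the same Kulc value.
balanced = dict(sup_a=2000, sup_b=2000, sup_ab=1000)
skewed = dict(sup_a=11000, sup_b=1100, sup_ab=1000)

kulc_balanced = kulczynski(**balanced)
kulc_skewed = kulczynski(**skewed)
ir_balanced = imbalance_ratio(**balanced)
ir_skewed = imbalance_ratio(**skewed)
print(kulc_balanced, ir_balanced)
print(kulc_skewed, ir_skewed)
```

Kulc is neutral (0.5) in both cases, while IR separates the balanced case from the heavily skewed one, which is why the two are read together.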

26 reviews, 10 followers
June 2, 2012
I selected this book hoping to understand the difference between Data Mining, which I wasn't familiar with yet, and the fields of Machine Learning and Statistics, which I already knew. This book provides a very good overview of Data Mining techniques in general, and it is also packed with practical examples that give a good intuition for what Data Mining actually is and how it relates to Machine Learning and Statistics.
异次元骇客
9 reviews
July 3, 2016
I read the translated Chinese version, not the original English version. I don't really like the Chinese version of this book. For some serious abstract names and concepts, there are lots of weird translations. This is my personal opinion.
I guess the English version may be easier to read, especially when it is the concepts that are mainly concerned.
Ayman Sieny
13 reviews, 3 followers
February 4, 2011
Another good book on data mining. It explains data mining algorithms and provides examples of their usage. The book is used as a textbook for Master's level studies in computer science.
Ohud Saud
93 reviews, 4 followers
April 21, 2015
WOW

I have read three or four books on data mining, taken two classes, and attended a conference. This is the best place for you to begin learning data mining basics and everything related to data analysis.
63 reviews, 14 followers
August 17, 2014
Read for Data Mining course. Well written and easy to follow with good examples.
R.
258 reviews, 18 followers
November 30, 2016
The book has simplistic language and is very easy to understand.
Very good from a student's perspective.
The Diagrams are easy to understand and explain the concept thoroughly.