This textbook explores the different aspects of data mining, from the fundamentals to the complex data types and their applications, capturing the wide diversity of problem domains for data mining issues. It goes beyond the traditional focus on data mining problems to introduce advanced data types such as text, time series, discrete sequences, spatial data, graph data, and social networks. Until now, no single book has addressed all these topics in a comprehensive and integrated way. The chapters of this book fall into one of three categories. The application chapters study important applications such as stream mining, Web mining, ranking, recommendations, social networks, and privacy preservation. The domain chapters also have an applied flavor. Appropriate for both introductory and advanced data mining courses, Data Mining: The Textbook balances mathematical details and intuition. It contains the necessary mathematical details for professors and researchers, but it is presented in a simple and intuitive style to improve accessibility for students and industrial practitioners (including those with a limited mathematical background). Numerous illustrations, examples, and exercises are included, with an emphasis on semantically interpretable examples.
Praise for Data Mining: The Textbook
"As I read through this book, I have already decided to use it in my classes. This is a book written by an outstanding researcher who has made fundamental contributions to data mining, in a way that is both accessible and up to date. The book is complete with theory and practical use cases. It's a must-have for students and professors alike!" -- Qiang Yang, Chair of Computer Science and Engineering at Hong Kong University of Science and Technology
"This is the most amazing and comprehensive text book on data mining. It covers not only the fundamental problems, such as clustering, classification, outliers and frequent patterns, and different data types, including text, time series, sequences, spatial data and graphs, but also various applications, such as recommenders, Web, social network and privacy. It is a great book for graduate students and researchers as well as practitioners." -- Philip S. Yu, UIC Distinguished Professor and Wexler Chair in Information Technology at University of Illinois at Chicago
Aggarwal deserves to be better known. This is an excellent survey of analytics and data mining models; it's unsuitable as a first book on the topic, but would be excellent as a 3rd or 4th. The great strength is the organized taxonomy in which techniques and subcomponents of techniques are presented. This also leads to some idiosyncrasies, such as calling linear regression an advanced special case of classification, which is, to put it mildly, a minority view.
The strongest parts of the book are his descriptions of common components, such as distance functions or meta-algorithms; knowing about these concepts is often enough to implement them. The weakest parts are the descriptions of individual algorithms, which are often too terse to follow unless you already understand what the algorithm or a related one does.
The best way to use this book is to first read one of the classics (e.g. Hastie and Tibshirani), and then read this book to fill in more options for how to alter and combine them. It also has significant coverage of association mining (i.e. recommender systems), which tends not to be treated elsewhere.