Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications brings together all the information, tools and methods a professional will need to efficiently use text mining applications and statistical analysis. Winner of a 2012 PROSE Award in Computing and Information Sciences from the Association of American Publishers, this book presents a comprehensive how-to reference that shows the user how to conduct text mining and statistically analyze results. In addition to providing an in-depth examination of core text mining and link detection tools, methods and operations, the book examines advanced preprocessing techniques, knowledge representation considerations, and visualization approaches. Finally, the book explores current real-world, mission-critical applications of text mining and link detection using real-world example tutorials in such varied fields as corporate finance, business intelligence, genomics research, and counterterrorism activities. The world contains an unimaginably vast amount of digital information which is getting ever vaster ever more rapidly. This makes it possible to do many things that previously could not be done: spot business trends, prevent diseases, combat crime and so on. Managed well, the textual data can be used to unlock new sources of economic value, provide fresh insights into science and hold governments to account. As the Internet expands and our natural capacity to process the unstructured text that it contains diminishes, the value of text mining for information retrieval and search will increase dramatically.
This is a 1000-page book and you would expect it to have a lot of information, but it really doesn't teach you much. The chapters are all written by different authors, so there's a lot of redundancy and the flow is poor. About 700 pages of it are tutorials and examples of text mining, which are poor and haphazardly organized. The tutorials generally use Statistica software, so if you don't have it, it is really hard to follow along. In addition, most of the pages in the tutorials are filled with hard-to-read screen captures that aren't very helpful. I admire the effort to include real data examples, but it hasn't been carried out very well in this book. The same points keep getting repeated over and over: "Text mining is valuable . . . " and "Text mining can do . . . "
I have heard a lot of hype about text mining and its ability to provide valuable insights because of the large amount of text data that is available. Rather than help me understand the hype, this book did the opposite. It has made me more skeptical of the potential value of text mining. Are there some successful applications out there? Absolutely. But this book doesn't really have many of them. It keeps claiming that text mining can be powerful, but the examples are so poorly done that it is difficult to see how the analysis actually led to a valuable business result or solved a real problem.
A well-structured textbook on basic and more advanced topics in text mining. It does a good job of explaining the main definitions, history and trends in the domain. Although it is a large volume, one can easily pick out only the topics of interest and need.