This ebook provides a quick summary of essential concepts in Big Data and Hadoop through the following snack-sized chapters:

Introduction to Big
• Introduction to Big Data
• Sources of Big Data
• Big Data Characteristics
• Big Data Analytics
• The Importance of Big Data

Big Data in the
• Big Data in the Enterprise
• Data Processing and The Old Way
• Big Data Enterprise Model
• Building a Big Data Platform

Hadoop and Hadoop
• Hadoop
• Why Hadoop?
• How does Hadoop help?
• Hadoop Infrastructure
• How Data Model is Different?
• How Computing Model is Different?
• Hadoop framework

Hadoop Distributed File System (HDFS):
• Hadoop Distributed File System
• Files and Blocks
• Replication
• A master-slave architecture
• Data Placement and Replication
• MapReduce
• Typical large-data problem
• MapReduce Paradigm – I
• Word count example
• MapReduce Paradigm – II
• MapReduce – Jobs and tasks
• A master-slave architecture
• MapReduce Programming Model
• MapReduce – word count mapper
• MapReduce – word count reducer
• MapReduce – word count main
• MapReduce – running a job

Relationship between MapReduce and
• Relationship between MapReduce and HDFS
• Clients, Data Nodes, and HDFS Storage
• MapReduce workloads
• Hadoop Fault Tolerance
• Reading/Writing Files

Hadoop and
• Hadoop and Databases
• Typical Datacenter Architecture
• Adding Hadoop to the Mix
• The Key Benefit
• Complex Data Processing

The Hadoop
• Job Execution
• Hadoop Data Types
• Job Configurations
• Input and Output Formats

Scenarios for Using Hadoop and Hadoop Live Use
• Scenarios for Using Hadoop
• Major Online Travel Booking Service
• Major National Bank
• Leading North American Retailer
• Netflix
This book was not really value for money. After paying 0.99, it still feels like I'm entitled to receive back the change. The book is likely accurate but so heavily loaded with terminology that it is difficult to follow. The author assumes a lot of IT knowledge from the reader. So, even though I wanted to understand Hadoop and MapReduce, I don't feel I gained much knowledge. But I guess any gain for 0.99 might be worth it. Hopefully it allows me to pick a better book on the topic next time.
I don’t know what this silly little book even intends to accomplish. A collection of bullet points hastily thrown together conveying very little. Hadoop is a very sophisticated technology - distributed processing of large distributed data sets. I knew that already. After reading this book I know not a jot more. Don’t waste your time.