Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem

Get Started Fast with Apache Hadoop® 2, YARN, and Today’s Hadoop Ecosystem

 



With Hadoop 2.x and YARN, Hadoop moves beyond MapReduce to become practical for virtually any type of data processing. Hadoop 2.x and the Data Lake concept represent a radical shift away from conventional approaches to data usage and storage. Hadoop 2.x installations offer unmatched scalability and breakthrough extensibility that supports new and existing Big Data analytics processing methods and models.

 

Hadoop® 2 Quick-Start Guide is the first easy, accessible guide to Apache Hadoop 2.x, YARN, and the modern Hadoop ecosystem. Building on his unsurpassed experience teaching Hadoop and Big Data, author Douglas Eadline covers all the basics you need to know to install and use Hadoop 2 on personal computers or servers, and to navigate the powerful technologies that complement it.

 

Eadline concisely introduces and explains every key Hadoop 2 concept, tool, and service, illustrating each with a simple “beginning-to-end” example and identifying trustworthy, up-to-date resources for learning more.

 

This guide is ideal if you want to learn about Hadoop 2 without getting mired in technical details. Douglas Eadline will bring you up to speed quickly, whether you’re a user, admin, devops specialist, programmer, architect, analyst, or data scientist.

 

Coverage Includes

Understanding what Hadoop 2 and YARN do, and how they improve on Hadoop 1 with MapReduce
Understanding Hadoop-based Data Lakes versus RDBMS Data Warehouses
Installing Hadoop 2 and core services on Linux machines, virtualized sandboxes, or clusters
Exploring the Hadoop Distributed File System (HDFS)
Understanding the essentials of MapReduce and YARN application programming
Simplifying programming and data movement with Apache Pig, Hive, Sqoop, Flume, Oozie, and HBase
Observing application progress, controlling jobs, and managing workflows
Managing Hadoop efficiently with Apache Ambari, including recipes for HDFS to NFSv3 gateway, HDFS snapshots, and YARN configuration
Learning basic Hadoop 2 troubleshooting, and installing Apache Hue and Apache Spark

888 pages, Kindle Edition

First published April 1, 2015


About the author

Doug Eadline

3 books

Ratings & Reviews

Community Reviews

5 stars: 7 (50%)
4 stars: 4 (28%)
3 stars: 1 (7%)
2 stars: 1 (7%)
1 star: 1 (7%)
Displaying 1 - 2 of 2 reviews
BCS
218 reviews, 33 followers
February 10, 2016
The book concerns the concepts and operation of a Big Data environment using the Apache Hadoop 2 ecosystem. As well as the usual introductory sections, it contains 10 major sections and 5 appendices.

The book really does take you from soup to nuts, as they say in the US, starting with an introduction to the concepts and history of Hadoop and Big Data, through installation, file system basics, MapReduce Framework & Programming, Hadoop Tools (including YARN applications), and finally the management and administration of Hadoop under Apache Ambari. The book also has its own website, complete with code downloads, question & answer forums, resource links and update information.

The guide starts at the very beginning for the complete novice, taking them through a step-by-step process to install Hadoop on a single machine, either as a virtual Hadoop sandbox (the Hortonworks Data Platform [HDP] Sandbox, to be precise) or in pseudo-distributed mode. The former is available for Microsoft or Apple operating systems.

The latter, while more complex, more closely resembles a fully operational Hadoop environment. Normally, Hadoop runs on a cluster of servers in a data centre, but this Quick-Start Guide provides the process needed to implement Hadoop on a stand-alone desktop or laptop for personal use and evaluation. Obviously, this restricts the size of the data involved and the analysis that can be undertaken, but it also provides an introduction for the individual approaching Big Data for the first time.

In a similar manner, the book then takes the reader through the full operation of the Hadoop 2 system, with code examples where necessary. All of this can therefore be used either by novices or by more experienced users working in a full-blown operational Hadoop environment.

The structure of the book is also linked to the video tutorials Hadoop Fundamentals: Live Lessons and Apache Hadoop YARN Fundamentals: Live Lessons, also produced by Douglas Eadline and Addison-Wesley, so that the two can be used in conjunction. The author suggests that this may be the best approach for taking on board the subject matter.

In essence, there is something in this book for everyone, from those who just want to see what all the Hadoop noise is about to those who are regular Hadoop users or administrators. The format used is excellent for this type of book, and one that should perhaps set the standard for other ‘quick start’ guides.

The instructions and code examples are easy to follow and provide all the required background. The layout also aids the reader who wants to pick and choose what they read, depending on their needs at that time, while still providing for the reader who needs to see the whole picture.

Particularly interesting was the section on HDFS (Hadoop Distributed File System), which provides information on the background to the chosen structure for its storage and command environment.

One of the appendices even gives a summary of the additional resource content in the full sections, so that the really high-level ‘helicopter’ reader is also served.

Obviously, as the title suggests, there is more detail to be had and I look forward to reading Douglas Eadline’s books at that level as well.

Review by Len Keighley
Originally posted: http://www.bcs.org/content/conWebDoc/...