“This book is a critically needed resource for the newly released Apache Hadoop 2.0, highlighting YARN as the significant breakthrough that broadens Hadoop beyond the MapReduce paradigm.”
From the Foreword by Raymie Stata, CEO of Altiscale
The Insider’s Guide to Building Distributed, Big Data Applications with Apache Hadoop™ YARN
Apache Hadoop is helping drive the Big Data revolution. Now, its data processing has been completely overhauled: Apache Hadoop YARN provides resource management at data center scale and easier ways to create distributed applications that process petabytes of data. And now in Apache Hadoop™ YARN, two Hadoop technical leaders show you how to develop new applications and adapt existing code to fully leverage these revolutionary advances.
YARN project founder Arun Murthy and project lead Vinod Kumar Vavilapalli demonstrate how YARN increases scalability and cluster utilization, enables new programming models and services, and opens new options beyond Java and batch processing. They walk you through the entire YARN project lifecycle, from installation through deployment.
You’ll find many examples drawn from the authors’ cutting-edge experience, first as Hadoop’s earliest developers and implementers at Yahoo! and now as Hortonworks developers moving the platform forward and helping customers succeed with it.
I really, really like the new data analytics book series by Addison-Wesley - this is the 2nd book in the series and it's even better than the first one. Why?
1.) Because it's the first real book about YARN.
2.) Because it's very detailed and comprehensive when it comes to the architecture and how stuff works "under the hood".
3.) Because there's no bloat and no watering down of the topic.
I've worked with YARN for a few months, but now I can say that I had been missing a lot of the platform's depth, focusing mainly on the MapReduce component inherited from Hadoop 1.x. This book helped me understand that.
Any cons? Yes, some:
1.) Sometimes the descriptions got very wordy and lacked better formatting, pictures / drawings or even basic tables. A picture is worth a thousand words - sometimes there were just too few of them.
2.) Code, code, code, code - there were a lot of nice descriptions, but only a few code examples: why?
3.) The actual content takes up only about 70% of the book - everything after 75% is indexes & useless appendixes. Not cool.