Learning Big Data with Amazon Elastic MapReduce

Easily learn, build, and execute real-world Big Data solutions using Hadoop and AWS EMR

About This Book

Learn how to solve big data problems using Apache Hadoop
Use Amazon Elastic MapReduce to create and maintain cluster infrastructure for big data analytics
A step-by-step guide exploring the vast set of services provided by Amazon on the cloud

Who This Book Is For

This book is aimed at developers and system administrators who want to learn about Big Data analysis using Amazon Elastic MapReduce. Basic Java programming knowledge is required. You should be comfortable with using command-line tools. Prior knowledge of AWS, API, and CLI tools is not assumed. Also, no exposure to Hadoop and MapReduce is expected.

What You Will Learn

Create and access your account on AWS and learn about its various services
Launch a machine on the cloud infrastructure of AWS, get login credentials, and communicate with that machine
Learn about the logical dataflow of MapReduce and how it uses distributed computing effectively
Understand the benefits of EMR over a local Hadoop cluster
Discover the best practices that should be kept in mind while planning and executing a cluster/job on EMR
Launch a cluster on Amazon EMR, submit the Hello World wordcount job for processing, and download and view the results
Execute jobs on EMR using the two primary methods provided by EMR

In Detail

Amazon Elastic MapReduce is a web service used to process and store vast amounts of data, and it is one of the largest Hadoop operators in the world. With the increase in the amount of data generated and collected by many businesses and the arrival of cost-effective cloud-based solutions for distributed computing, the feasibility of crunching large amounts of data to get deep insights within a short span of time has increased greatly.

This book will get you started with AWS so that you can quickly create your own account and explore the services provided, many of which you might be delighted to use. This book covers the architectural details of the MapReduce framework, Apache Hadoop, various job models on EMR, how to manage clusters on EMR, and the command-line tools available with EMR. Each chapter builds on the knowledge of the previous one, leading to the final chapter where you will learn about solving a real-world use case using Apache Hadoop and EMR. This book will, therefore, get you up and running with major Big Data technologies quickly and efficiently.

242 pages, Kindle Edition

First published October 30, 2014

About the author

Amarkant Singh

2 books, 1 follower

Ratings & Reviews

Community Reviews

5 stars: 3 (60%)
4 stars: 2 (40%)
3 stars: 0 (0%)
2 stars: 0 (0%)
1 star: 0 (0%)
Displaying 1 - 2 of 2 reviews
March 23, 2015
This book provides a comprehensive introduction to the what and how-tos of AWS. It also covers Hadoop from the basics to advanced methodologies, along with some very useful examples, which makes it an excellent starting point for beginners who wish to gain useful insight into big data and AWS EMR. The programming part of Hadoop on AWS includes examples of niche EMR use cases and covers best practices. The learning curve is steep, given that this domain is constantly on the move. This book is excellent material for beginners who have little to no knowledge of Elastic MapReduce.