
Learning Apache Spark 2

Learning Apache Spark 2: Big Data Analytics & Processing at the Speed of Light for Apache Spark 2.0 Beginners

Delve into the world of Apache Spark 2 and master its intricacies, concepts, and architecture with Learning Apache Spark 2. A comprehensive guide to Apache Spark 2 for beginners, this book covers everything you need to know to get up and running with fast data processing, and helps you understand the technical aspects through real-life examples.

Summary

Learning Apache Spark 2 is the ultimate guide to getting started with the fastest-growing open-source project for big data analytics. You’ll cover Apache Spark with Python, R, Java & Scala, and get to grips with data exploration and data processing.

About the Technology

Apache Spark 2.0 is a general-purpose engine for big data analytics and efficient data processing, with built-in libraries for SQL, streaming, machine learning, and graph processing. It exposes its capabilities through several language APIs, including R and Java. With access to diverse data sources and a unified API, it’s easy to see why Apache Spark is the hottest technology for big data analytics.

About the Book

Learning Apache Spark 2 is a superb introduction to Apache Spark 2 for beginners, covering everything you need to know about big data analytics & fast data processing. Learn how to install Apache Spark, write & build your first Spark program, and work through real-world examples easily and confidently.

What’s Inside

Overview of Apache Spark architecture & installation
Understand Spark Streaming & Machine Learning
Build a recommendation system

About the Reader

Perfect for those who want to get to grips with Apache Spark for fast and efficient data processing, this book provides a recap of the history of big data, and explains the reasons why Apache Spark is so popular. You’ll need Spark 2.0, which can be downloaded from the Apache Spark website.

About the Author

Muhammad Asif Abbasi has worked in the industry for over 15 years, and is currently a Principal Business Solutions Manager. Muhammad is an Oracle Certified Java EE 5 Enterprise architect, PMP, Hortonworks Hadoop Certified developer and administrator, and holds a Master’s degree in Computer Science & Business Administration.

Table of Contents

Architecture and Installation
Transformations and Actions with Spark RDDs
ETL with Spark
Spark SQL
Spark Streaming
Machine Learning with Spark
GraphX
Operating in Clustered Mode
Building a Recommendation System
Customer Churn Prediction
There's More with Spark

356 pages, Kindle Edition

Published June 6, 2017

8 people are currently reading
14 people want to read

Ratings & Reviews

Community Reviews

5 stars: 1 (25%)
4 stars: 3 (75%)
3 stars: 0 (0%)
2 stars: 0 (0%)
1 star: 0 (0%)
