Learning Apache Spark 2: Big Data Analytics & Processing at the Speed of Light for Apache Spark 2.0 Beginners
Delve into the world of Apache Spark 2 and master its concepts and architecture with Learning Apache Spark 2. A comprehensive guide to Apache Spark 2 for beginners, this book covers everything you need to know to get up and running with fast data processing, and helps you understand the technical aspects through real-life examples.
Summary
Learning Apache Spark 2 is the ultimate guide to getting started with the fastest growing open source project for big data analytics. You’ll cover Apache Spark with Python, R, Java & Scala, and get to grips with data exploration and data processing.
About the Technology
As a general-purpose engine for large-scale data processing, Apache Spark 2.0 enables big data analytics and efficient data processing. Apache Spark exposes its capabilities through APIs in several languages, including R and Java. With access to diverse data sources and a unified API, it’s easy to see why Apache Spark is the hottest technology for big data analytics.
About the Book
Learning Apache Spark 2 is a superb introduction to Apache Spark 2 for beginners, covering everything you need to know about big data analytics & fast data processing. Learn how to install Apache Spark, write & build your first Spark program, and work through real-world examples easily and confidently.
What’s Inside
Overview of Apache Spark architecture & installation
Understand Spark Streaming & Machine Learning
Build a recommendation system
About the Reader
Perfect for those who want to get to grips with Apache Spark for fast and efficient data processing, this book provides a recap of the history of big data, and explains the reasons why Apache Spark is so popular. You’ll need Spark 2.0, which can be downloaded from the Apache Spark website.
About the Author
Muhammad Asif Abbasi has worked in the industry for over 15 years, and is currently a Principal Business Solutions Manager. Muhammad is an Oracle Certified Java EE 5 Enterprise Architect, PMP, Hortonworks Hadoop Certified Developer and Administrator, and holds a Master’s degree in Computer Science & Business Administration.
Table of Contents
Architecture and Installation
Transformations and Actions with Spark RDDs
ETL with Spark
Spark SQL
Spark Streaming
Machine Learning with Spark
GraphX
Operating in Clustered Mode
Building a Recommendation System
Customer Churn Prediction
There's More with Spark