Chapter 1: Introduction to PySparkSQL
Chapter goal: The reader will learn about PySpark, PySparkSQL, the Catalyst Optimizer, Project Tungsten, and Hive.
No. of pages: 20-30
Sub-topics:
1. PySpark
2. PySparkSQL
3. Hive
4. Catalyst
5. Project Tungsten

Chapter 2: Some Time with Installation
Chapter goal: The learner will learn how to install Spark, Hive, PostgreSQL, MySQL, MongoDB, Cassandra, etc.
No. of pages: 30-40
Sub-topics:
1. Installing Spark
2. Installing Hive
3. Installing MySQL
4. Installing MongoDB

Chapter 3: IO in PySparkSQL
Chapter goal: This chapter provides recipes that enable the reader to create PySparkSQL DataFrames from different sources.
No. of pages: 40-50
Sub-topics:
1. Creating a DataFrame from data
2. Reading a CSV file to create a DataFrame
3. Reading a JSON file to create a DataFrame
4. Saving DataFrames to different formats

Chapter 4: Operations on PySparkSQL DataFrames
Chapter goal: The reader will learn about data filtering, data manipulation, descriptive data analysis, dealing with missing values, etc.
No. of pages: 40-50
Sub-topics:
1. Data filtering
2. Data manipulation
3. Row and column manipulation

Chapter 5: Data Merging and Data Aggregation Using PySparkSQL
Chapter goal: The reader will learn about data merging and aggregation using PySparkSQL.
Sub-topics:
1. Data merging
2. Data aggregation

Chapter 6: SQL, NoSQL, and PySparkSQL
Chapter goal: The reader will learn to run SQL and HiveQL queries on DataFrames.
No. of pages: 30-40
Sub-topics:
1. Running SQL on a DataFrame
2. Running HiveQL

Chapter 7: Structured Streaming
Chapter goal: The reader will learn about structured streaming.
No. of pages: 30-40
Sub-topics:
1. Different types of modes
2. Data aggregation in structured streaming
3. Different types of sources

Chapter 8: Optimizing PySparkSQL
Chapter goal: The reader will learn about optimizing PySparkSQL.
No. of pages: 20-30
Sub-topics:
1. Optimizing PySparkSQL

Chapter 9: GraphFrames
Chapter goal: The reader will learn about graph data analysis with GraphFrames.
No. of pages: 30-40
Sub-topics:
1. GraphFrame Creat