PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes - Softcover

 
9781484243367: PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes

This specific ISBN edition is currently not available.

Synopsis

Chapter 1: Introduction to PySparkSQL

Chapter Goal: The reader will gain an understanding of PySpark, PySparkSQL, the Catalyst optimizer, Project Tungsten, and Hive.

No. of pages: 20-30

Sub-Topics:

1. PySpark

2. PySparkSQL

3. Hive

4. Catalyst

5. Project Tungsten

 

Chapter 2: Some time with Installation

Chapter Goal: The reader will learn how to install Spark, Hive, PostgreSQL, MySQL, MongoDB, Cassandra, etc.

No. of pages: 30-40

Sub-Topics:

1. Installing Spark

2. Installing Hive

3. Installing MySQL

4. Installing MongoDB

Chapter 3: IO in PySparkSQL

Chapter Goal: This chapter provides recipes that enable the reader to create PySparkSQL DataFrames from different sources.

No. of pages: 40-50

Sub-Topics:

1. Creating a DataFrame from data

2. Reading a CSV file to create a DataFrame

3. Reading a JSON file to create a DataFrame

4. Saving DataFrames to different formats

 

Chapter 4: Operations on PySparkSQL DataFrames

Chapter Goal: The reader will learn about data filtering, data manipulation, descriptive data analysis, dealing with missing values, etc.

No. of pages: 40-50

1. Data filtering

2. Data manipulation

3. Row and column manipulation

 

Chapter 5: Data Merging and Data Aggregation using PySparkSQL

Chapter Goal: The reader will learn about data merging and aggregation using PySparkSQL.

1. Data merging

2. Data aggregation

 

Chapter 6: SQL, NoSQL and PySparkSQL

Chapter Goal: The reader will learn to run SQL and HiveQL queries on DataFrames.

No. of pages: 30-40

Sub-Topics:

1. Running SQL on a DataFrame

2. Running HiveQL

 

Chapter 7: Structured Streaming

Chapter Goal: The reader will gain an understanding of structured streaming.

No. of pages: 30-40

1. Different types of modes

2. Data aggregation in structured streaming

3. Different types of sources

 

Chapter 8: Optimizing PySparkSQL

Chapter Goal: The reader will learn about optimizing PySparkSQL.

No. of pages: 20-30

Optimizing PySparkSQL

 

Chapter 9: GraphFrames

Chapter Goal: The reader will learn about graph data analysis with GraphFrames.

No. of pages: 30-40

1. GraphFrame Creat

"synopsis" may belong to another edition of this title.

Other Popular Editions of the Same Title

9781484243343: PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes

Featured Edition

ISBN 10: 148424334X ISBN 13: 9781484243343
Publisher: Apress, 2019
Softcover