Hands-On Big Data Analytics with PySpark (Paperback)
Rudy Lai
Sold by Grand Eagle Retail, Bensenville, IL, U.S.A.
AbeBooks Seller since 12 October 2005
New - Soft cover
Condition: New
Ships within U.S.A.
Quantity: 1 available
Add to basketSold by Grand Eagle Retail, Bensenville, IL, U.S.A.
AbeBooks Seller since 12 October 2005
Condition: New
Quantity: 1 available
Add to basketPaperback. Use PySpark to easily crush messy data at-scale and discover proven techniques to create testable, immutable, and easily parallelizable Spark jobsKey FeaturesWork with large amounts of agile data using distributed datasets and in-memory cachingSource data from all popular data hosting platforms, such as HDFS, Hive, JSON, and S3Employ the easy-to-use PySpark API to deploy big data Analytics for productionBook DescriptionApache Spark is an open source parallel-processing framework that has been around for quite some time now. One of the many uses of Apache Spark is for data analytics applications across clustered computers. In this book, you will not only learn how to use Spark and the Python API to create high-performance analytics with big data, but also discover techniques for testing, immunizing, and parallelizing Spark jobs.You will learn how to source data from all popular data hosting platforms, including HDFS, Hive, JSON, and S3, and deal with large datasets with PySpark to gain practical big data experience. This book will help you work on prototypes on local machines and subsequently go on to handle messy data in production and at scale. This book covers installing and setting up PySpark, RDD operations, big data cleaning and wrangling, and aggregating and summarizing data into useful reports. You will also learn how to implement some practical and proven techniques to improve certain aspects of programming and administration in Apache Spark.By the end of the book, you will be able to build big data analytical solutions using the various PySpark offerings and also optimize them effectively.What you will learnGet practical big data experience while working on messy datasetsAnalyze patterns with Spark SQL to improve your business intelligenceUse PySpark's interactive shell to speed up development timeCreate highly concurrent Spark programs by leveraging immutabilityDiscover ways to avoid the most expensive operation in the Spark API: the shuffle operationRe-design your jobs to use reduceByKey instead of groupByCreate robust processing pipelines by testing Apache Spark jobsWho this book is forThis book is for developers, data scientists, business analysts, or anyone who needs to reliably analyze large amounts of large-scale, real-world data. Whether you're tasked with creating your company's business intelligence function or creating great data platforms for your machine learning models, or are looking to use code to magnify the impact of your business, this book is for you. In this book, you'll learn to implement some practical and proven techniques to improve aspects of programming and administration in Apache Spark. Techniques are demonstrated using practical examples and best practices. You will also learn how to use Spark and its Python API to create performant analytics with large-scale data. Shipping may be from multiple locations in the US or from the UK, depending on stock availability.
Seller Inventory # 9781838644130
Use PySpark to easily crush messy data at-scale and discover proven techniques to create testable, immutable, and easily parallelizable Spark jobs
Key Features:
- Work with large amounts of agile data using distributed datasets and in-memory caching
- Source data from all popular data hosting platforms, such as HDFS, Hive, JSON, and S3
- Employ the easy-to-use PySpark API to deploy big data Analytics for production
Book Description:
Apache Spark is an open source parallel-processing framework that has been around for quite some time now. One of the many uses of Apache Spark is for data analytics applications across clustered computers. In this book, you will not only learn how to use Spark and the Python API to create high-performance analytics with big data, but also discover techniques for testing, immunizing, and parallelizing Spark jobs.
You will learn how to source data from all popular data hosting platforms, including HDFS, Hive, JSON, and S3, and deal with large datasets with PySpark to gain practical big data experience. This book will help you work on prototypes on local machines and subsequently go on to handle messy data in production and at scale. This book covers installing and setting up PySpark, RDD operations, big data cleaning and wrangling, and aggregating and summarizing data into useful reports. You will also learn how to implement some practical and proven techniques to improve certain aspects of programming and administration in Apache Spark.
By the end of the book, you will be able to build big data analytical solutions using the various PySpark offerings and also optimize them effectively.
What You Will Learn:
- Get practical big data experience while working on messy datasets
- Analyze patterns with Spark SQL to improve your business intelligence
- Use PySpark s interactive shell to speed up development time
- Create highly concurrent Spark programs by leveraging immutability
- Discover ways to avoid the most expensive operation in the Spark API: the shuffle operation
- Re-design your jobs to use reduceByKey instead of groupBy
- Create robust processing pipelines by testing Apache Spark jobs
Who this book is for:
This book is for developers, data scientists, business analysts, or anyone who needs to reliably analyze large amounts of large-scale, real-world data. Whether you're tasked with creating your company's business intelligence function or creating great data platforms for your machine learning models, or are looking to use code to magnify the impact of your business, this book is for you.
"About this title" may belong to another edition of this title.
We guarantee the condition of every book as it¿s described on the Abebooks web sites. If you¿ve changed
your mind about a book that you¿ve ordered, please use the Ask bookseller a question link to contact us
and we¿ll respond within 2 business days.
Books ship from California and Michigan.
If you are a consumer you can cancel the contract in accordance with the following. Consumer means any natural person who is acting for purposes which are outside his trade, business, craft or profession.
INFORMATION REGARDING THE RIGHT OF CANCELLATION
Statutory Right to cancel
You have the right to cancel this contract within 14 days without giving any reason.
The cancellation period will expire after 14 days from the day on which you acquire, or a third party other than the carrier and indicated by you acquires, physical possession of the the last good or the last lot or piece.
To exercise the right to cancel, you must inform us, Grand Eagle Retail, 26C Trolley Square, 19806, Wilmington, Delaware, U.S.A., 1 (302) 261-2674, of your decision to cancel this contract by a clear statement (e.g. a letter sent by post, fax or e-mail). You may use the attached model cancellation form, but it is not obligatory. You can also electronically fill in and submit a clear statement on our website, under "My Purchases" in "My Account". If you use this option, we will communicate to you an acknowledgement of receipt of such a cancellation on a durable medium (e.g. by e-mail) without delay.
To meet the cancellation deadline, it is sufficient for you to send your communication concerning your exercise of the right to cancel before the cancellation period has expired.
Effects of cancellation
If you cancel this contract, we will reimburse to you all payments received from you, including the costs of delivery (except for the supplementary costs arising if you chose a type of delivery other than the least expensive type of standard delivery offered by us).
We may make a deduction from the reimbursement for loss in value of any goods supplied, if the loss is the result of unnecessary handling by you.
We will make the reimbursement without undue delay, and not later than 14 days after the day on which we are informed about your decision to cancel with contract.
We will make the reimbursement using the same means of payment as you used for the initial transaction, unless you have expressly agreed otherwise; in any event, you will not incur any fees as a result of such reimbursement.
We may withhold reimbursement until we have received the goods back or you have supplied evidence of having sent back the goods, whichever is the earliest.
You shall send back the goods or hand them over to us or Grand Eagle Retail, Grand Eagle Retail c/o Kable Product Services, 4275 Thunderbird Lane, 45014-45, Fairfield, Ohio, U.S.A., 1 (302) 261-2674, without undue delay and in any event not later than 14 days from the day on which you communicate your cancellation from this contract to us. The deadline is met if you send back the goods before the period of 14 days has expired. You will have to bear the direct cost of returning the goods. You are only liable for any diminished value of the goods resulting from the handling other than what is necessary to establish the nature, characteristics and functioning of the goods.
Exceptions to the right of cancellation
The right of cancellation does not apply to:
Model withdrawal form
(complete and return this form only if you wish to withdraw from the contract)
To: (Grand Eagle Retail, 26C Trolley Square, 19806, Wilmington, Delaware, U.S.A., 1 (302) 261-2674)
I/We (*) hereby give notice that I/We (*) withdraw from my/our (*) contract of sale of the following goods (*)/for the provision of the following goods (*)/for the provision of the following service (*),
Ordered on (*)/received on (*)
Name of consumer(s)
Address of consumer(s)
Signature of consumer(s) (only if this form is notified on paper)
Date
* Delete as appropriate.
Orders usually ship within 2 business days. All books within the US ship free of charge. Delivery is 4-14 business days anywhere in the United States.
Books ship from California and Michigan.
If your book order is heavy or oversized, we may contact you to let you know extra shipping is required.
| Order quantity | 6 to 16 business days | 6 to 14 business days |
|---|---|---|
| First item | £ 0.00 | £ 0.00 |
Delivery times are set by sellers and vary by carrier and location. Orders passing through Customs may face delays and buyers are responsible for any associated duties or fees. Sellers may contact you regarding additional charges to cover any increased costs to ship your items.