Learn to understand and prepare data using BigQuery to make your data accurate, reliable, and ready for analysis and modeling
Key Features
- Explore data with the BigQuery web UI, the bq CLI, and the BigQuery API in the Cloud Console using a mock dataset
- Learn techniques for optimizing storage and query performance in BigQuery
- Work through case studies on data exploration and preparation for advertising, transportation, and customer support data
Book Description
Data professionals today encounter challenges including handling large volumes of data, dealing with data silos, and lacking appropriate tools. Datasets often arrive in different conditions and formats, demanding considerable time from analysts, engineers, and scientists to process and uncover insights. The complexity of the data lifecycle often hinders teams and organizations from extracting the desired value from their data.
Data Exploration & Preparation with BigQuery helps you address these challenges. The book begins with the basics of BigQuery and the fundamentals of data exploration and preparation. It then demonstrates how BigQuery can be used for data exploration and preparation, and explains the various big data tools available on Google Cloud. You will learn how to shape your tables for query efficiency and how to apply best practices for data preparation. You will also learn when to use Dataflow, BigQuery, and Dataprep for ETL. Finally, the book walks you through several case studies that demonstrate how BigQuery can be used to solve real-world data problems.
By the end of this book, you will be able to use SQL to explore and prepare datasets in BigQuery and unlock insights from your data.
What you will learn
- Assess the quality of a dataset and learn best practices for data cleansing
- Prepare data for analysis, visualization, and machine learning
- Explore approaches to visualizing data in BigQuery
- Apply lessons learned to real-life scenarios and design patterns
- Set up and organize BigQuery resources
- Use SQL and other tools to explore datasets
- Follow best practices for querying BigQuery datasets
- Understand data preparation tools, techniques, and strategies
Who This Book Is For
This book is for data analysts who want to learn how to explore and prepare data using BigQuery. Readers are expected to have a basic understanding of SQL, reporting, data modeling, and transformations. It is also an excellent guide for anyone who plans to use BigQuery as a data warehouse to deliver insights to their business from large datasets.
Table of Contents
- BigQuery and Data Exploration and Preparation Introduction
- BigQuery Design and Organization
- Exploring Data in BigQuery
- Loading and Transforming Data
- Querying BigQuery Data
- Exploring Data with Notebooks
- Exploring and Visualizing Data
- Data Preparation Tools
- Cleansing and Transforming Data
- Data Preparation and Cost Control Best Practices
- End to End Use Case for Advertising Data
- End to End Use Case for Transportation Data
- End to End Use Case for Customer Support Data
- Summary of Key Points, Future Directions, and Resources
About the Author
Mike Kahn is a data and infrastructure enthusiast who currently leads a Customer Engineering team at Google Cloud. Prior to Google, Mike spent five years in solution architecture roles and worked in operations and leadership roles in the data center industry. More than 15 years of experience have given him deep knowledge of data and infrastructure engineering, operations, strategy, and leadership. Mike holds multiple Google Cloud certifications and is a lifelong learner. He is based in Boca Raton, Florida, in the US, and holds a Bachelor of Science degree in Management Information Systems (MIS) from the University of Central Florida and a Master of Science degree in MIS from Florida International University.