Synopsis:
This book provides solid practical guidance on summarizing, visualizing and interpreting the most important information in large multivariate data sets, using principal component methods (PCMs) in R. The visualization is based on the factoextra R package, which we developed for easily creating beautiful ggplot2-based graphs from the output of PCMs. This book contains 4 parts. Part I provides a quick introduction to R and presents the key features of FactoMineR and factoextra. Part II describes classical principal component methods for analyzing data sets that contain predominantly either continuous or categorical variables. These methods include Principal Component Analysis (PCA, for continuous variables), simple Correspondence Analysis (CA, for large contingency tables formed by two categorical variables) and Multiple Correspondence Analysis (MCA, for a data set with more than 2 categorical variables). In Part III, you'll learn advanced methods for analyzing a data set containing a mix of continuous and categorical variables, whether or not they are structured into groups: Factor Analysis of Mixed Data (FAMD) and Multiple Factor Analysis (MFA). Part IV covers Hierarchical Clustering on Principal Components (HCPC), which is useful for clustering a data set that contains only categorical variables or a mix of categorical and continuous variables.
About the Author:
Alboukadel Kassambara holds a PhD in Bioinformatics and Cancer Biology. He has worked for many years on genomic data analysis and visualization (http://www.alboukadel.com/). He created a bioinformatics web tool named GenomicScape (www.genomicscape.com) for gene expression data analysis and visualization. He also developed a training website on data science named STHDA (Statistical Tools for High-throughput Data Analysis, www.sthda.com/english), which contains many tutorials on data analysis and visualization using R software and packages. He is the author of many popular R packages for: 1) multivariate data analysis (factoextra), 2) survival analysis (survminer), 3) correlation analysis (ggcorrplot) and 4) creating publication-ready plots in R (ggpubr). Recently, he published three books on data analysis and visualization: 1) Practical Guide to Cluster Analysis in R (https://goo.gl/DmJ5y5), 2) Guide to Create Beautiful Graphics in R (https://goo.gl/vJ0OYb), 3) Complete Guide to 3D Plots in R (https://goo.gl/v5gwl0).