Statistical Foundations of Data Science

Authors: Jianqing Fan, Hui Zou, Runze Li, Cun-Hui Zhang

Format: Hardback

Publisher: Taylor & Francis Inc

Published: 17th Aug '20

Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models and to contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as both a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises that involve both theoretical studies and empirical applications.

The book begins with an introduction to the stylized features of big data and their impacts on statistical analysis. It then introduces multiple linear regression and expands the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account of sparsity explorations and model selection for multiple regression, generalized linear models, quantile regression, robust regression, and hazards regression, among others. High-dimensional inference and feature screening are also thoroughly addressed. The book also gives a comprehensive account of high-dimensional covariance estimation and of learning latent factors and hidden structures, as well as their applications to statistical estimation, inference, prediction and machine learning problems. It also thoroughly introduces statistical machine learning theory and methods for classification, clustering, and prediction, including CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.

"This book delivers a very comprehensive summary of the development of statistical foundations of data science. The authors no doubt are doing frontier research and have made several crucial contributions to the field. Therefore, the book offers a very good account of the most cutting-edge development. The book is suitable for both master and Ph.D. students in statistics, and also for researchers in both applied and theoretical data science. Researchers can take this book as an index of topics, as it summarizes in brief many significant research articles in an accessible way. Each chapter can be read independently by experienced researchers. It provides a nice cover of key concepts in those topics and researchers can benefit from reading the specific chapters and paragraphs to get a big picture rather than diving into many technical articles. There are altogether 14 chapters. It can serve as a textbook for two semesters. The book also provides handy codes and data sets, which is a great treasure for practitioners."
~Journal of Time Series Analysis

"This text—collaboratively authored by renowned statisticians Fan (Princeton Univ.), Li (Pennsylvania State Univ.), Zhang (Rutgers Univ.), and Zou (Univ. of Minnesota)—laboriously compiles and explains theoretical and methodological achievements in data science and big data analytics. Amid today's flood of coding-based cookbooks for data science, this book is a rare monograph addressing recent advances in mathematical and statistical principles and the methods behind regularized regression, analysis of high-dimensional data, and machine learning. The pinnacle achievement of the book is its comprehensive exploration of sparsity for model selection in statistical regression, considering models such as generalized linear regression, penalized least squares, quantile and robust regression, and survival regression. The authors discuss sparsity not only in terms of various types of penalties but also as an important feature of numerical optimization algorithms, now used in manifold applications including deep learning. The text extensively probes contemporary high-dimensional data modeling methods such as feature screening, covariate regularization, graphical modeling, and principal component and factor analysis. The authors conclude by introducing contemporary statistical machine learning, spanning a range of topics in supervised and unsupervised learning techniques and deep learning. This book is a must-have bookshelf item for those with a thirst for learning about the theoretical rigor of data science."
~Choice Review, S-T. Kim, North Carolina A&T State University, August 2021

ISBN: 9781466510845

Dimensions: unknown

Weight: 1260g

774 pages