Refining the Concept of Scientific Inference When Working with Big Data

Proceedings of a Workshop

Division on Engineering and Physical Sciences (author); Board on Mathematical Sciences and Their Applications (author); Committee on Applied and Theoretical Statistics (author); National Academies of Sciences, Engineering, and Medicine (author); Ben A Wender (editor)

Format: Paperback

Publisher: National Academies Press

Published: 24th Mar '17



The concept of utilizing big data to enable scientific discovery has generated tremendous excitement and investment from both private and public sectors over the past decade, and expectations continue to grow. Using big data analytics to identify complex patterns hidden inside volumes of data that have never been combined could accelerate the rate of scientific discovery and lead to the development of beneficial technologies and products. However, producing actionable scientific knowledge from such large, complex data sets requires statistical models that produce reliable inferences (NRC, 2013). Without careful consideration of the suitability of both available data and the statistical models applied, analysis of big data may result in misleading correlations and false discoveries, which can potentially undermine confidence in scientific research if the results are not reproducible. In June 2016 the National Academies of Sciences, Engineering, and Medicine convened a workshop to examine critical challenges and opportunities in performing scientific inference reliably when working with big data. Participants explored new methodologic developments that hold significant promise and potential research program areas for the future. This publication summarizes the presentations and discussions from the workshop.

Table of Contents
  • Front Matter
  • 1 Introduction
  • 2 Framing the Workshop
  • 3 Inference About Discoveries Based on Integration of Diverse Data Sets
  • 4 Inference About Causal Discoveries Driven by Large Observational Data
  • 5 Inference When Regularization Is Used to Simplify Fitting of High-Dimensional Models
  • 6 Panel Discussion
  • References
  • Appendixes
  • Appendix A: Registered Workshop Participants
  • Appendix B: Workshop Agenda
  • Appendix C: Acronyms

ISBN: 9780309454445

Dimensions: unknown

Weight: unknown

114 pages