Small Summaries for Big Data

Graham Cormode (author), Ke Yi (author)

Format: Hardback

Publisher: Cambridge University Press

Published: 12th Nov '20

A comprehensive introduction to flexible, efficient tools for describing massive data sets to improve the scalability of data analysis.

The massive volume of data generated in modern applications requires the ability to build compact summaries of datasets. This introduction, aimed at students and practitioners, covers algorithms to describe massive data sets, from simple sums to advanced probabilistic structures, with applications in big data, data science, and machine learning.

The massive volume of data generated in modern applications can overwhelm our ability to conveniently transmit, store, and index it. For many scenarios, building a compact summary of a dataset that is vastly smaller enables flexibility and efficiency in a range of queries over the data, in exchange for some approximation. This comprehensive introduction to data summarization, aimed at practitioners and students, showcases the algorithms, their behavior, and the mathematical underpinnings of their operation. The coverage starts with simple sums and approximate counts, building to more advanced probabilistic structures such as the Bloom Filter, distinct value summaries, sketches, and quantile summaries. Summaries are described for specific types of data, such as geometric data, graphs, and vectors and matrices. The authors offer detailed descriptions of and pseudocode for key algorithms that have been incorporated in systems from companies such as Google, Apple, Microsoft, Netflix and Twitter.

'A very thorough compendium of sketching and streaming algorithms, and an excellent resource for anyone interested in learning about them, understanding how they work and deploying them in applications. Good job!' Piotr Indyk, Massachusetts Institute of Technology

ISBN: 9781108477444

Dimensions: 234mm x 157mm x 19mm

Weight: 510g

278 pages