Frank Kane's Taming Big Data with Apache Spark and Python
Learn Big Data processing with hands-on Spark tutorials
Format: Paperback
Publisher: Packt Publishing Limited
Published: 30th Jun '17
Currently unavailable; no restock date is known.
This book offers a practical guide to mastering Apache Spark and big data processing using Python, featuring hands-on tutorials and real-world examples.
In Frank Kane's Taming Big Data with Apache Spark and Python, readers are guided through the intricacies of Apache Spark, a powerful tool for big data processing. This book is designed to help data scientists and analysts understand how to leverage Spark for analyzing large datasets, whether on a single machine or across a computing cluster. With a focus on hands-on learning, Frank Kane presents over 15 real-world examples that illustrate the practical applications of Spark in various scenarios.
The book begins by introducing the fundamentals of Spark, including installation and setup on both local systems and clusters. Readers will learn how to frame big data challenges as Spark problems and how to use Spark's Resilient Distributed Datasets (RDDs) for efficient data processing. The book also covers advanced topics such as machine learning with Spark's MLlib library, real-time data processing using Spark Streaming, and complex network analysis through the GraphX library.
Frank Kane's Taming Big Data with Apache Spark and Python suits readers with some Python programming experience who want to go deeper into big data processing. Its step-by-step approach lets readers progress at their own pace, making it an accessible resource for anyone looking to learn the Spark ecosystem and build real-time Spark projects.
ISBN: 9781787287945
Pages: 296