buchspektrum Online Bookstore

New Releases 2018

As of: 2020-02-01
Herderstraße 10
10625 Berlin
Tel.: 030 315 714 16
Fax 030 315 714 14
info@buchspektrum.de

Subhashini Chellappan, Dharanitharan Ganesan (contributors)

Practical Apache Spark


Using the Scala API
1st ed. 2018. xvi, 280 pp., 303 b/w illustrations, 254 mm
Publisher/Year: SPRINGER, BERLIN; APRESS 2018
ISBN: 1-484-23651-3 (1484236513)
New ISBN: 978-1-484-23651-2 (9781484236512)



Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses various components of Spark, such as Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark, with practical code snippets for each topic. Practical Apache Spark also covers the integration of Apache Spark with Kafka, with examples. You'll follow a learn-to-do-by-yourself approach: learn the concepts, practice the code snippets in Scala, and complete the assignments to get overall exposure.
On completion, you'll have knowledge of the functional programming aspects of Scala and hands-on expertise in various Spark components. You'll also become familiar with machine learning algorithms and their real-time usage.
What You Will Learn

Discover the functional programming features of Scala
Understand the complete architecture of Spark and its components
Integrate Apache Spark with Hive and Kafka
Use Spark SQL, DataFrames, and Datasets to process data using traditional SQL queries
Work with different machine learning concepts and libraries using Spark's MLlib packages

Who This Book Is For

Developers and professionals who deal with batch and stream data processing.

Chapter 1: Scala - Functional Programming Aspects

Chapter 2: Single & Multi-node cluster setup

Chapter 3: Introduction to Apache Spark and Spark Core

Chapter 4: Spark SQL, Dataframes & Datasets

Chapter 5: Introduction to Spark Streaming

Chapter 6: Spark Structured Streaming

Chapter 7: Spark Streaming with Kafka

Chapter 8: Spark Machine Learning Library

Chapter 9: Working with SparkR

Chapter 10: Spark - Real time use case

Subhashini Chellappan is an associate manager and technology enthusiast. She has rich experience in both academia and the software industry. She has published two books: Big Data Analytics and Pro Tableau. Her areas of interest and expertise are centered on business intelligence, big data analytics and cloud computing.




Bharath Kumar Dasa is a technology lead with expertise in the big data space and core expertise in the complete Hadoop stack. He has worked on the HDP distribution and has architected multiple data management and data life cycle auto service management projects for financial institutions. He has been working in machine learning and the integration of machine learning with big data technologies for the past few years. His areas of interest and expertise are centered on big data and analytics, machine learning, data visualization, and deep learning.