Scala for Spark

Course Description

Course Overview:

Scala for Spark is a one-stop course that introduces you to Spark development and gives you the technical know-how to practice it. By the end of the course you will be able to earn a Spark professional credential and to process and analyze terabyte-scale data with Spark and its ecosystem. Scala is a concise JVM language that combines functional and object-oriented programming and is well suited to large-scale applications. Spark Streaming is a component of Apache Spark that extends the core Spark API to process live data streams. Together, Spark Streaming and Scala let you build streaming applications for big data.

Course Objectives:

Create Spark applications with the Scala programming language.
Use Spark Streaming to process continuous streams of real-time data.

Pre-requisites:

Programming and scripting experience

Target Audience:

Software Engineers

Course Duration:

21 hours – 3 days

Course Content:

Introduction

Scala Programming in Depth Review

Syntax and structure
Flow control and functions
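
As a taste of what this review covers, the sketch below shows pattern matching used as flow control and a small higher-order function pipeline. It is a minimal illustration in plain Scala with no Spark dependency; the names `WordStats`, `classify`, and `countWords` are hypothetical and not taken from the course material.

```scala
// Illustrative only: WordStats, classify and countWords are hypothetical names.
object WordStats {
  // Pattern matching as flow control: each case is tried in order.
  def classify(n: Int): String = n match {
    case 0          => "none"
    case 1          => "single"
    case x if x < 5 => "few"
    case _          => "many"
  }

  // A higher-order pipeline: split lines into words, group equal words, count them.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.toLowerCase.split("\\s+"))
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.size) }

  def main(args: Array[String]): Unit = {
    val counts = countWords(Seq("spark and scala", "scala and streams"))
    // Print the words in alphabetical order with their classification.
    for ((word, n) <- counts.toSeq.sortBy(_._1))
      println(s"$word: $n (${classify(n)})")
  }
}
```

The same `flatMap`, `filter`, and `map` style carries over to Spark, where the collection is distributed across a cluster instead of held in local memory.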

Spark Internals

Resilient Distributed Datasets (RDD)
How a Spark script becomes an execution graph (DAG) and runs on a cluster

Overview of Spark Streaming

Streaming architecture
Intervals in streaming
Fault tolerance
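
Spark Streaming processes a live stream as a series of small batches, one per batch interval. The sketch below illustrates that idea in plain Scala only; it is an analogy and does not use the Spark API. The names `MicroBatchSketch`, `Event`, and `toBatches` are hypothetical, and timestamps are assumed to be non-negative.

```scala
// Plain-Scala analogy of interval-based batching; not Spark code.
object MicroBatchSketch {
  // A hypothetical event: arrival time in milliseconds plus a payload.
  final case class Event(timestampMs: Long, value: String)

  // Assign each event to the interval that contains its timestamp and
  // return the batches ordered by interval start time.
  // Assumes non-negative timestamps.
  def toBatches(events: Seq[Event], intervalMs: Long): Seq[(Long, Seq[String])] = {
    require(intervalMs > 0, "batch interval must be positive")
    events
      .groupBy(e => (e.timestampMs / intervalMs) * intervalMs)
      .toSeq
      .sortBy(_._1)
      .map { case (start, batch) => (start, batch.sortBy(_.timestampMs).map(_.value)) }
  }

  def main(args: Array[String]): Unit = {
    val events = Seq(Event(100, "a"), Event(950, "b"), Event(1200, "c"), Event(2500, "d"))
    toBatches(events, intervalMs = 1000).foreach { case (start, values) =>
      println(s"batch starting at $start ms: ${values.mkString(", ")}")
    }
  }
}
```

A shorter interval lowers latency but produces more, smaller batches; a longer one does the opposite. Choosing that trade-off is the kind of decision the "Intervals in streaming" topic addresses.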

Preparing the Development Environment

Installing and configuring Apache Spark
Installing and configuring the Scala IDE
Installing and configuring JDK

Spark Streaming Beginner to Advanced

Working with key/value RDDs
Filtering RDDs
Improving Spark scripts with regular expressions
Sharing data on a cluster
Working with network data sets
Implementing BFS algorithms
Creating Spark driver scripts
Tracking in real time with scripts
Writing continuous applications
Streaming linear regression
Using Spark Machine Learning Library

Spark and Clusters

Bundling dependencies and Spark scripts using the SBT tool
Using Amazon EMR to illustrate clusters
Optimizing by partitioning RDDs
Using Spark logs

Integration in Spark Streaming

Integrating Apache Kafka and working with Kafka topics
Integrating Apache Flume and working with pull-based/push-based Flume configurations
Writing a custom receiver class
Integrating Cassandra and exposing data as real-time services

In Production

Packaging an application and running it with spark-submit
Troubleshooting, tuning, and debugging Spark Jobs and clusters

Course Customization Options

To request customized training for this course, please contact us.


