Assimilation of Spark Streaming With Kafka

Knoldus Blogs

As we know Spark is used at a wide range of organizations to process large datasets. It seems like spark becoming main stream. In this blog we will talk about Integration of Kafka with Spark Streaming. So, lets get started.

How Kafka can be integrated with Spark?

Kafka provides a messaging and integration platform for Spark streaming. Kafka act as the central hub for real-time streams of data and are processed using complex algorithms in Spark Streaming. Once the data is processed, Spark Streaming could be used to publish results into yet another Kafka topic.

Let’s see how to configure Spark Streaming to receive data from Kafka by creating a SBT project first and add the following dependencies in build.sbt.

val sparkCore = "org.apache.spark" % "spark-core_2.11" % "2.2.0"
val sparkSqlKafka = "org.apache.spark" % "spark-sql-kafka-0-10_2.11" % "2.2.0"
val sparkSql = "org.apache.spark" % "spark-sql_2.11" % "2.2.0"

libraryDependencies ++= Seq(sparkCore, sparkSql, sparkSqlKafka)

View original post 171 more words

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s