What’s new in Apache Spark 2.2

Knoldus Blogs

Apache recently released a new version of Spark, Apache Spark 2.2. The release brings improvements to existing features as well as new functionality.

The headline change in this release is Structured Streaming, which is now marked as production ready: its experimental tag has been removed.

Some of the high-level changes and improvements:

  • Production-ready Structured Streaming
  • Expanded SQL functionality
  • New distributed machine learning algorithms in R
  • Additional algorithms in MLlib and GraphX

Spark 2.2 declares Structured Streaming production ready and adds the following high-level changes:

  • Kafka Source and Sink: In the previous Spark version, Kafka was supported only as a source. In the current release, Kafka can be used both as a source and as a sink.
  • Kafka Improvements: A cached instance of the Kafka producer is now used when writing to the Kafka sink, which reduces latency.
  • Additional Stateful APIs: Support for complex stateful processing and timeouts using 
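
As a rough illustration of the Kafka source and sink point above, here is a minimal PySpark sketch that reads from one topic and writes to another. It assumes the `spark-sql-kafka-0-10` package is on the classpath; the function names, broker address, topic names, and checkpoint path are hypothetical placeholders and do not come from the original post.

```python
def kafka_source_options(bootstrap_servers, topic):
    """Options for the built-in "kafka" streaming source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
    }


def kafka_sink_options(bootstrap_servers, topic, checkpoint_dir):
    """Options for the "kafka" streaming sink (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "topic": topic,
        # A checkpoint location is needed so the query can recover its progress.
        "checkpointLocation": checkpoint_dir,
    }


def start_kafka_to_kafka(spark, bootstrap_servers, in_topic, out_topic, checkpoint_dir):
    """Start a streaming query that copies records from in_topic to out_topic.

    `spark` is an existing SparkSession; the returned object is a StreamingQuery.
    """
    events = (
        spark.readStream
        .format("kafka")
        .options(**kafka_source_options(bootstrap_servers, in_topic))
        .load()
    )
    # Kafka rows arrive with binary key/value columns. The sink expects a
    # "value" column (and optionally "key") of string or binary type.
    records = events.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
    )
    return (
        records.writeStream
        .format("kafka")
        .options(**kafka_sink_options(bootstrap_servers, out_topic, checkpoint_dir))
        .start()
    )
```

With a running SparkSession, a call such as `start_kafka_to_kafka(spark, "localhost:9092", "events-in", "events-out", "/tmp/checkpoints/copy")` would start the query; calling `awaitTermination()` on the result keeps the driver alive while it runs.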
