Apache recently released a new version of Spark: Apache Spark 2.2. The release improves existing features and adds new functionality.
The headline change in this release is Structured Streaming, which is now marked production ready; its experimental tag has been removed.
Some of the high-level changes and improvements:
- Production-ready Structured Streaming
- Expanded SQL functionality
- New distributed machine learning algorithms in R
- Additional algorithms in MLlib and GraphX
Besides declaring Structured Streaming production ready, Spark 2.2 brings these high-level changes to it:
- Kafka source and sink: In the previous Spark version, Kafka was supported only as a source. In the current release, Kafka can be used as both a source and a sink.
- Kafka improvements: Writes to the Kafka sink now reuse a cached Kafka producer instance instead of creating a new one each time, which reduces latency.
- Additional Stateful APIs: Support for complex stateful processing and timeouts using
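
To make the Kafka source and sink point concrete, here is a minimal PySpark sketch that reads from one Kafka topic and writes to another with Structured Streaming. It is an illustration under assumptions, not code from the release notes: the helper names (`kafka_source_options`, `kafka_sink_options`, `start_kafka_to_kafka`), the broker address, the topic names, and the checkpoint path are all hypothetical. It also assumes the `spark-sql-kafka-0-10` connector is available to the Spark application.

```python
from typing import Dict


def kafka_source_options(bootstrap_servers: str, topic: str) -> Dict[str, str]:
    """Options for reading a stream from a Kafka topic (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
    }


def kafka_sink_options(bootstrap_servers: str, topic: str,
                       checkpoint_dir: str) -> Dict[str, str]:
    """Options for writing a stream to a Kafka topic (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "topic": topic,
        # A streaming query that writes to Kafka needs a checkpoint location.
        "checkpointLocation": checkpoint_dir,
    }


def start_kafka_to_kafka(spark, servers: str, in_topic: str,
                         out_topic: str, checkpoint_dir: str):
    """Read from one topic, upper-case each value, write to another topic.

    `spark` is an existing SparkSession supplied by the caller.
    """
    source = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(servers, in_topic))
        .load()
    )
    # Kafka records arrive with binary key/value columns; the Kafka sink
    # expects a `value` column and optionally a `key` column.
    transformed = source.selectExpr(
        "CAST(key AS STRING) AS key",
        "upper(CAST(value AS STRING)) AS value",
    )
    return (
        transformed.writeStream.format("kafka")
        .options(**kafka_sink_options(servers, out_topic, checkpoint_dir))
        .start()
    )
```

A caller would create a `SparkSession`, invoke something like `start_kafka_to_kafka(spark, "localhost:9092", "events-in", "events-out", "/tmp/kafka-checkpoint")` (placeholder values), and then call `awaitTermination()` on the returned query.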