How Does Spark Use MapReduce?

Knoldus Blogs

In this post we will talk about an interesting question: does Spark use MapReduce or not? The answer is yes, but it uses only the idea, not the exact implementation. Let's take an example: to read a text file from Spark, what we all do is

spark.sparkContext.textFile("fileName")
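Before digging into the internals, it may help to recall what the MapReduce idea looks like. Here is a minimal plain-Scala sketch (no Spark, no cluster — the sample lines are made up for illustration): the classic word count, expressed as a map phase, a grouping step, and a reduce phase, which is the same shape Spark's RDD operations follow.

```scala
// Sample input, standing in for the lines of a text file
val lines = Seq("spark uses the mapreduce idea", "spark is fast")

val counts: Map[String, Int] = lines
  .flatMap(_.split(" "))          // map phase: emit individual words
  .map(word => (word, 1))         // emit (word, 1) pairs
  .groupBy(_._1)                  // shuffle: group pairs by key
  .map { case (w, ps) => (w, ps.map(_._2).sum) } // reduce: sum per key
```

In Spark the grouping and summing would be a single `reduceByKey(_ + _)` on an RDD, but the underlying map/shuffle/reduce structure is the same.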

But do you know how it actually works? Try to Ctrl+click on the textFile method and you will find this code:

/**
 * Read a text file from HDFS, a local file system (available on all nodes), or any
 * Hadoop-supported file system URI, and return it as an RDD of Strings.
 */
def textFile(
    path: String,
    minPartitions: Int = defaultMinPartitions): RDD[String] = withScope {
  assertNotStopped()
  hadoopFile(path, classOf[TextInputFormat], classOf[LongWritable], classOf[Text],
    minPartitions).map(pair => pair._2.toString).setName(path)
}

As you can see, it calls the hadoopFile method of the Hadoop API with four parameters: the file path, the input format (TextInputFormat), the key class (LongWritable), and the value class (Text), along with the minimum number of partitions.
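To make the `.map(pair => pair._2.toString)` step concrete, here is a small plain-Scala sketch (the offsets and line contents are invented for illustration): hadoopFile yields (byte offset, line) pairs, and Spark keeps only the line text.

```scala
// hadoopFile produces (LongWritable, Text) pairs: the key is the byte
// offset of the line within the file, the value is the line itself.
// We fake that output here with plain Scala types.
val hadoopPairs: Seq[(Long, String)] =
  Seq((0L, "first line"), (11L, "second line"))

// The same transformation as .map(pair => pair._2.toString):
// drop the offset, keep only the text of each line.
val lineTexts: Seq[String] = hadoopPairs.map(_._2)
```

This is why textFile returns an RDD[String] rather than an RDD of key-value pairs: the byte offsets are discarded after reading.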

