Correct order of various phases of MR job?

Posted 2019-07-27 08:50

I am trying to understand the phases that an MR job goes through, and I have read the online documentation on this.

Based on this, my understanding of the sequence is as follows:

map() -> Partitioner -> Sorting (at mapper machine) -> Shuffle -> Sorting (at reducer machine) -> groupBy(Key) (at reducer machine) -> reduce()

Is this the correct sequence in which an MR job executes?

Tags: hadoop mapreduce yarn hadoop2

2 answers

#2 · 2019-07-27 09:13

Various phases of a map reduce job:

Map phase:

Partition phase

Shuffle phase

Fetches input data from all map tasks for the portion corresponding to the reduce task's bucket

Sort phase

Reduce phase
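To make the ordering concrete, here is a minimal sketch that simulates these phases in plain Python for a word count. It is not Hadoop code: `map_fn`, `partition`, `run_map_task`, `run_reduce_task`, and `run_job` are hypothetical names made up for illustration, the partitioner uses a toy hash rather than Hadoop's own, and the per-partition sort on the map side follows the sequence given in the question.

```python
from collections import defaultdict
from heapq import merge
from itertools import groupby
from operator import itemgetter

NUM_REDUCERS = 2  # assumed value, chosen only for this illustration


def map_fn(line):
    """Map phase: emit (word, 1) for every word in one input record."""
    for word in line.split():
        yield word, 1


def partition(key, num_reducers):
    """Partition phase: pick the reduce task (bucket) that owns this key.

    Toy deterministic hash; a real partitioner would hash the key object.
    """
    return sum(key.encode("utf-8")) % num_reducers


def run_map_task(lines, num_reducers):
    """One map task: map, partition, then sort each partition by key."""
    buckets = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            buckets[partition(key, num_reducers)].append((key, value))
    # Map-side sort: each partition is written out sorted by key.
    return {p: sorted(pairs, key=itemgetter(0)) for p, pairs in buckets.items()}


def run_reduce_task(reducer_id, map_outputs):
    """One reduce task: shuffle, sort (merge), group by key, reduce."""
    # Shuffle phase: fetch this reducer's bucket from every map task.
    fetched = [output.get(reducer_id, []) for output in map_outputs]
    # Sort phase: merge the already sorted runs into one sorted stream.
    merged = merge(*fetched, key=itemgetter(0))
    # Group by key, then reduce phase: sum the values of each key.
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(merged, key=itemgetter(0))
    }


def run_job(splits, num_reducers=NUM_REDUCERS):
    """Run one map task per input split, then one reduce task per bucket."""
    map_outputs = [run_map_task(split, num_reducers) for split in splits]
    result = {}
    for reducer_id in range(num_reducers):
        result.update(run_reduce_task(reducer_id, map_outputs))
    return result
```

For example, `run_job([["a b a"], ["b c"]])` runs two map tasks and two reduce tasks. Each reduce task only ever sees the keys that `partition` assigned to its bucket, which is why the partitioning has to happen before the shuffle.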


#3 · 2019-07-27 09:15

Timeline of Map Reduce Job

Timeline for MapTask

Timeline for ReduceTask
