What does the shuffling phase actually do?
A) Since shuffling is the process of bringing the mapper output to the reducer input, it just brings specific keys from the mappers to particular reducers, based on the code written in the partitioner.
E.g. the output of mapper 1 is {a,1} {b,1}
the output of mapper 2 is {a,1} {b,1}
and in my partitioner, I have written that all keys starting with 'a' go to reducer 1 and all keys starting with 'b' go to reducer 2, so the output would be:
reducer 1: {a,1}{a,1}
reducer 2: {b,1}{b,1}
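
To make option A concrete, here is a minimal Python sketch of what I mean. It is only a simulation of my example, not Hadoop code: the names `partition` and `shuffle_only` are made up for illustration, and the first-letter rule is the partitioner I described above.

```python
from collections import defaultdict

# Mapper outputs from my example, as (key, value) pairs.
mapper_outputs = [
    [("a", 1), ("b", 1)],  # mapper 1
    [("a", 1), ("b", 1)],  # mapper 2
]

def partition(key):
    """Hypothetical partitioner: keys starting with 'a' go to reducer 1, all others to reducer 2."""
    return 1 if key.startswith("a") else 2

def shuffle_only(outputs):
    """Route every pair to its reducer, without grouping the values."""
    reducers = defaultdict(list)
    for pairs in outputs:
        for key, value in pairs:
            reducers[partition(key)].append((key, value))
    return dict(reducers)

routed = shuffle_only(mapper_outputs)
for reducer_id, pairs in sorted(routed.items()):
    print(f"reducer {reducer_id}: {pairs}")
```

Under option A, each reducer would simply see a flat list of pairs like this, with repeated keys still separate.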
B) Or, along with the above process, does it also group the keys?
So the output would be:
reducer 1: {a,[1,1]}
reducer 2: {b,[1,1]}
In my opinion, it should be just point A, because grouping of keys must take place after sorting: sorting is done only so that the reducer can easily tell where one key ends and the next one starts. If so, when does the grouping of keys actually happen? Please elaborate.
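
To show why I think grouping depends on sorting, here is a small Python sketch of my reasoning. It is my own illustration, not how Hadoop is actually implemented: `group_sorted` is a made-up helper, and I am assuming a single reducer that receives both keys interleaved. The helper only compares each key with the previous one, so it detects key boundaries correctly only when the input is sorted.

```python
def group_sorted(pairs):
    """Group values by key in one pass, closing a group whenever the key changes."""
    grouped = []
    current_key, current_values = None, []
    for key, value in pairs:
        # A key change marks the boundary between two groups.
        if current_values and key != current_key:
            grouped.append((current_key, current_values))
            current_values = []
        current_key = key
        current_values.append(value)
    if current_values:
        grouped.append((current_key, current_values))
    return grouped

# Assumed arrival order at one reducer: keys interleaved across mappers.
unsorted_pairs = [("b", 1), ("a", 1), ("b", 1), ("a", 1)]
sorted_pairs = sorted(unsorted_pairs, key=lambda kv: kv[0])

fragmented = group_sorted(unsorted_pairs)  # same key split into several groups
merged = group_sorted(sorted_pairs)        # one group per key

print(fragmented)
print(merged)
```

Without the sort, the same key ends up in several fragments; with it, each key forms one contiguous run, which is the form shown in option B.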
Thanks