What does the shuffle phase actually do?

A) Since shuffling is the process of bringing the mappers' output to the reducers' input, does it just bring specific keys from the mappers to specific reducers, based on the code written in the partitioner?

E.g. the output of mapper 1 is {a, 1} {b, 1}

the output of mapper 2 is {a, 1} {b, 1}

and in my partitioner I wrote that all keys starting with 'a' go to reducer 1 and all keys starting with 'b' go to reducer 2, so the output would be:

reducer 1: {a, 1} {a, 1}

reducer 2: {b, 1} {b, 1}
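
To make option A concrete, here is a small Python sketch of the routing step only. It is a simulation, not Hadoop code: `partition` and `shuffle` are hypothetical names, and the first-letter rule stands in for the custom partitioner described above.

```python
from collections import defaultdict

def partition(key):
    """Hypothetical first-letter rule: keys starting with 'a' -> reducer 1, all others -> reducer 2."""
    return 1 if key.startswith("a") else 2

def shuffle(mapper_outputs):
    """Route every (key, value) pair to its reducer without grouping the values."""
    reducers = defaultdict(list)
    for output in mapper_outputs:
        for key, value in output:
            reducers[partition(key)].append((key, value))
    return dict(reducers)

mapper_1 = [("a", 1), ("b", 1)]
mapper_2 = [("a", 1), ("b", 1)]
routed = shuffle([mapper_1, mapper_2])
print(routed)
```

Under this reading each reducer still receives separate pairs; nothing has merged the two values for 'a' into one list yet.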

B) Or, along with the above process, does it also group the keys?

So the output would be:

reducer 1: {a, [1,1]}

reducer 2: {b, [1,1]}
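
For illustration, the grouped form in option B can be derived from a list of separate pairs by sorting them by key and collecting each run of equal keys. The Python sketch below only simulates that idea; `group_sorted` is a hypothetical name, and this is not a claim about how Hadoop implements it.

```python
from itertools import groupby
from operator import itemgetter

def group_sorted(pairs):
    """Sort pairs by key, then collect the values of each run of equal keys."""
    ordered = sorted(pairs, key=itemgetter(0))
    return [
        (key, [value for _, value in run])
        for key, run in groupby(ordered, key=itemgetter(0))
    ]

# Pairs as they might arrive at a single reducer, in arbitrary order.
reducer_input = [("b", 1), ("a", 1), ("b", 1), ("a", 1)]
grouped = group_sorted(reducer_input)
print(grouped)
```

Because the pairs are sorted first, a new group starts exactly where the key changes, so a single pass over the data is enough.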

In my opinion it should be A. Grouping the keys should be the job of the sort, because sorting is done only so that the reducer can easily tell where one key ends and the next begins. If that is right, when does the grouping of keys actually happen? Please clarify.

Thanks.
