About Spark UnsafeShuffleWriter
I have two questions about UnsafeShuffleWriter. UnsafeShuffleWriter is used when all three of the following conditions hold (a sketch of this check appears after the list):
- The shuffle dependency specifies no aggregation or output ordering.
- The shuffle serializer supports relocation of serialized values (this is currently supported by KryoSerializer and Spark SQL's custom serializers).
- The shuffle produces fewer than 16,777,216 (2^24) output partitions.
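For context, here is a minimal, self-contained sketch of that eligibility check, modeled on the check in Spark's SortShuffleManager (canUseSerializedShuffle). The ShuffleDepInfo stand-in type and its field names are my own simplification, not Spark's API:

```scala
// Sketch of the eligibility check, modeled on Spark's
// SortShuffleManager.canUseSerializedShuffle. ShuffleDepInfo is a
// hypothetical stand-in for org.apache.spark.ShuffleDependency.
final case class ShuffleDepInfo(
    serializerSupportsRelocation: Boolean, // Serializer.supportsRelocationOfSerializedObjects
    hasAggregator: Boolean,                // dependency.aggregator.isDefined
    hasKeyOrdering: Boolean,               // dependency.keyOrdering.isDefined
    numPartitions: Int)                    // dependency.partitioner.numPartitions

object SerializedShuffleCheck {
  // The partition id is packed into 24 bits of a record pointer,
  // hence the 16,777,216 (2^24) partition limit.
  val MaxShuffleOutputPartitions: Int = 1 << 24

  def canUseSerializedShuffle(dep: ShuffleDepInfo): Boolean =
    dep.serializerSupportsRelocation &&              // condition 2
      !dep.hasAggregator &&                          // condition 1: no aggregation
      !dep.hasKeyOrdering &&                         // condition 1: no output ordering
      dep.numPartitions < MaxShuffleOutputPartitions // condition 3
}
```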
I am confused about the first two conditions.
- Why must the shuffle dependency specify no aggregation or output ordering? It seems fine to use UnsafeShuffleWriter whenever mapSideCombine = false, regardless of whether an aggregation or ordering is specified.
- Why must the serializer support relocation of serialized values, and where is that relocation actually used? (See the sketch after this list.)
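To make my second question concrete, here is my own illustration (not Spark's code) of what relocation would enable: sorting already-serialized records by partition id by moving their opaque byte chunks around, without deserializing anything.

```scala
import java.io.ByteArrayOutputStream

// My own illustration, not Spark's code: a record that has already been
// serialized to bytes, tagged with the partition it belongs to.
final case class SerializedRecord(partitionId: Int, bytes: Array[Byte])

object RelocationSketch {
  // Sort records by partition id by moving their serialized byte chunks,
  // then concatenate them. Nothing is deserialized, so this is only correct
  // if a record's bytes decode the same way wherever they land in the output
  // stream: that is the "relocation of serialized values" property. It also
  // shows why no aggregation or key ordering can be honored here: records
  // are ordered by partition id alone.
  def writePartitioned(records: Seq[SerializedRecord]): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    records.sortBy(_.partitionId).foreach(r => out.write(r.bytes, 0, r.bytes.length))
    out.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val records = Seq(
      SerializedRecord(partitionId = 2, bytes = "b".getBytes("UTF-8")),
      SerializedRecord(partitionId = 0, bytes = "a".getBytes("UTF-8")),
      SerializedRecord(partitionId = 1, bytes = "c".getBytes("UTF-8")))
    // Prints "acb": chunks regrouped by partition, content untouched.
    println(new String(writePartitioned(records), "UTF-8"))
  }
}
```

My current understanding is that a stream format with cross-record back-references (e.g. plain Java serialization) would break under this kind of reordering, which would explain the restriction, but I may be missing something.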