Accessing a key from mapValues or flatMapValues?
In Spark 1.3, is there a way to access the key from within mapValues?
In particular, if I have
val y = x.groupBy(someKey)
val z = y.mapValues(someFun)
can someFun find out which key of y it is currently working on?
Or do I need to do
val y = x.map(r => (someKey(r), r)).groupBy(_._1)
val z = y.mapValues{ case (k, r) => someFun(r, k) }
Note: the reason I want to use mapValues and not map is to preserve the partitioning.
3 answers
You cannot access the key from mapValues. But you can preserve the partitioning with mapPartitions.
import org.apache.spark.rdd.RDD

val pairs: RDD[(Int, Int)] = ???
pairs.mapPartitions({ it =>
  it.map { case (k, v) =>
    // your code: return a pair whose key is still k
  }
}, preservesPartitioning = true)
Be careful to actually preserve the partitioning, that is, to leave the keys unchanged: the compiler cannot check this for you.
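As a rough sketch of how this could be wrapped up for reuse, the helper below applies a function of both key and value while passing preservesPartitioning = true. The names KeyAwareMapValues and mapValuesWithKey are made up for illustration and are not part of Spark; the sample data and the local[2] master are likewise assumptions. It relies on the caller's function never changing the key, which is what makes it safe to claim the partitioning is preserved.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object KeyAwareMapValues {
  // Hypothetical helper (not a Spark API): like mapValues, but f also sees the key.
  // The output key is always the input key, so the existing partitioner stays valid.
  def mapValuesWithKey[K, V, U](rdd: RDD[(K, V)])(f: (K, V) => U): RDD[(K, U)] =
    rdd.mapPartitions(
      it => it.map { case (k, v) => (k, f(k, v)) },
      preservesPartitioning = true
    )

  def main(args: Array[String]): Unit = {
    // Assumed local setup, for illustration only.
    val conf = new SparkConf().setAppName("key-aware-mapValues").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // groupBy shuffles the data, so y carries a partitioner.
      val y = sc.parallelize(1 to 10).groupBy(_ % 3)

      // Key-aware transformation that keeps y's partitioner.
      val z = mapValuesWithKey(y)((k, vs) => k * 100 + vs.sum)

      // The same logic through plain map drops the partitioner.
      val viaMap = y.map { case (k, vs) => (k, k * 100 + vs.sum) }

      println(s"mapPartitions keeps partitioner: ${z.partitioner == y.partitioner}")
      println(s"map drops partitioner: ${viaMap.partitioner.isEmpty}")
    } finally {
      sc.stop()
    }
  }
}
```

Comparing z.partitioner with y.partitioner, as main does, is one way to confirm at runtime that the partitioning survived, since nothing at compile time enforces it.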