How does Spark in Java compare two keys when doing a join or groupWith?
I am trying to do the following:
JavaPairRDD<JsonObject, JsonObject> rdd1 = ..
JavaPairRDD<JsonObject, String> rdd2 = ..
JavaPairRDD<JsonObject, Tuple2<Iterable<JsonObject>, Iterable<String>>>
    groupedRDD = rdd1.groupWith(rdd2);
But I'm not sure how Spark will compare the two JsonObject keys.
More generally, how are keys compared when doing a join or groupWith?
1 answer
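It uses the Java equals() method (together with hashCode(), which Spark uses to decide which partition a key goes to).
The catch is that equals() is not implemented in JsonObject. Therefore, the default implementation from Object is used, which only compares object references.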
From the Javadoc: The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true).
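One possible workaround, as a minimal sketch: wrap each key in a small class that overrides both equals() and hashCode(), so that two keys with the same content are treated as the same key. The names JsonKey, RawKey and KeyEqualityDemo are hypothetical, and a plain HashMap stands in for Spark's key matching here; it is only an analogy for the equals()/hashCode() contract, not Spark's actual code path. The sketch also assumes you can produce the same JSON text for equal objects (for example, with a stable field order).

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical wrapper: compares keys by their JSON text, not by reference.
// Serializable because Spark needs to ship keys between nodes.
final class JsonKey implements Serializable {
    private final String json; // assumed to be a canonical JSON string

    JsonKey(String json) {
        this.json = Objects.requireNonNull(json);
    }

    String json() {
        return json;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof JsonKey)) return false;
        return json.equals(((JsonKey) other).json);
    }

    @Override
    public int hashCode() {
        // Must agree with equals(): equal keys produce equal hash codes.
        return json.hashCode();
    }
}

// Stand-in for a key class that inherits equals()/hashCode() from Object.
final class RawKey {
    final String json;

    RawKey(String json) {
        this.json = json;
    }
}

public class KeyEqualityDemo {
    public static void main(String[] args) {
        // Reference equality: same content, but two separate entries.
        Map<RawKey, Integer> raw = new HashMap<>();
        raw.merge(new RawKey("{\"id\":1}"), 1, Integer::sum);
        raw.merge(new RawKey("{\"id\":1}"), 1, Integer::sum);
        System.out.println(raw.size()); // prints 2

        // Content equality: both values land under one key.
        Map<JsonKey, Integer> wrapped = new HashMap<>();
        wrapped.merge(new JsonKey("{\"id\":1}"), 1, Integer::sum);
        wrapped.merge(new JsonKey("{\"id\":1}"), 1, Integer::sum);
        System.out.println(wrapped.size()); // prints 1
    }
}
```

In the Spark job itself, you could convert the keys of both RDDs with mapToPair before calling groupWith, either to a wrapper like this or simply to a String, which already compares by content.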