How to convert Dataset to JavaPairRDD?

There is a method for converting a Dataset to a JavaRDD:

Dataset<Row> dataFrame;
// toJavaRDD() on a Dataset<Row> yields a JavaRDD<Row>, not a JavaRDD<String>
JavaRDD<Row> data = dataFrame.toJavaRDD();

      

Is there a way to convert a Dataset to a JavaPairRDD<Long, Vector>?



1 answer


You can use a PairFunction as below. Check the index of each field in your dataset first: in the example below, index 0 holds a long value and index 3 holds a vector.



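import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;

// Build (key, value) pairs from each Row: field 0 is the Long key, field 3 is the Vector value.
JavaPairRDD<Long, Vector> jpRDD = dataFrame.toJavaRDD().mapToPair(new PairFunction<Row, Long, Vector>() {
    @Override
    public Tuple2<Long, Vector> call(Row row) throws Exception {
        return new Tuple2<Long, Vector>((Long) row.get(0), (Vector) row.get(3));
    }
});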
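
With Java 8 or later, the same conversion can probably be written more compactly with a lambda, and the fields can be looked up by column name rather than by position. The following is only a sketch under assumptions that are not in the question: the column names `id` and `features`, the class name `DatasetToPairRdd`, the local SparkSession, and the choice of `org.apache.spark.ml.linalg.Vector` (swap in `org.apache.spark.mllib.linalg.Vector` if that is what your column holds) are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.ml.linalg.SQLDataTypes;
import org.apache.spark.ml.linalg.Vector;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

import scala.Tuple2;

public class DatasetToPairRdd {

    // Pairs each row's "id" column with its "features" column.
    // Both column names are hypothetical; use the ones from your own schema.
    static JavaPairRDD<Long, Vector> toPairRdd(Dataset<Row> dataFrame) {
        return dataFrame.toJavaRDD().mapToPair(row ->
            new Tuple2<>(row.<Long>getAs("id"), row.<Vector>getAs("features")));
    }

    public static void main(String[] args) {
        // Local session used only to make the sketch runnable on its own.
        SparkSession spark = SparkSession.builder()
            .master("local[1]")
            .appName("DatasetToPairRdd")
            .getOrCreate();

        // Small illustrative dataset: a long key and an ML vector per row.
        StructType schema = new StructType(new StructField[] {
            DataTypes.createStructField("id", DataTypes.LongType, false),
            DataTypes.createStructField("features", SQLDataTypes.VectorType(), false)
        });
        List<Row> rows = Arrays.asList(
            RowFactory.create(1L, Vectors.dense(0.5, 1.5)),
            RowFactory.create(2L, Vectors.dense(2.0, 3.0)));
        Dataset<Row> dataFrame = spark.createDataFrame(rows, schema);

        JavaPairRDD<Long, Vector> pairs = toPairRdd(dataFrame);
        for (Tuple2<Long, Vector> pair : pairs.collect()) {
            System.out.println(pair._1() + " -> " + pair._2());
        }

        spark.stop();
    }
}
```

Looking fields up by name avoids having to count positions when the schema changes, but `getAs` only works by name on rows that carry a schema, which rows coming from a Dataset do. If the key column is not actually a `Long` (for example an `Integer`), the cast fails at runtime, so check the schema with `dataFrame.printSchema()` first.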

      
