How to keep the same order of schema columns with Spark Dataset map?

I am reading data from a Hive table and then trying to enrich it with an additional column derived from the other columns. But I'm having trouble with Spark changing my schema and reordering all columns alphabetically by name.

After calling withColumn() and encoding with my enriched class, the schema is still correct, but as soon as I call map(), the schema changes and the column order is wrong. How can I tell Spark to keep the original column order?
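I suspect (this is my guess, not something I found stated in the Spark docs) that Encoders.bean derives its schema through JavaBean introspection, and java.beans.Introspector reports properties sorted by name rather than in declaration order. A plain-JDK demo with a hypothetical bean (field names are made up for illustration):

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;

// Hypothetical bean: fields declared in the order zebra, apple, enrichedColumn.
public class BeanOrderDemo {
    public static class Enriched {
        private String zebra;
        private String apple;
        private String enrichedColumn;
        public String getZebra() { return zebra; }
        public void setZebra(String z) { this.zebra = z; }
        public String getApple() { return apple; }
        public void setApple(String a) { this.apple = a; }
        public String getEnrichedColumn() { return enrichedColumn; }
        public void setEnrichedColumn(String e) { this.enrichedColumn = e; }
    }

    public static void main(String[] args) throws IntrospectionException {
        // Object.class as the stop class excludes the inherited "class" property
        PropertyDescriptor[] props = Introspector
                .getBeanInfo(Enriched.class, Object.class)
                .getPropertyDescriptors();
        for (PropertyDescriptor p : props) {
            // Prints apple, enrichedColumn, zebra: sorted by name, not declaration order
            System.out.println(p.getName());
        }
    }
}
```

If that is what Spark relies on internally, the alphabetical schema after map() would be expected behavior rather than a bug.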

session.table("myTable")
    .as(Encoders.bean(Base.class))
    .withColumn("enrichedColumn", lit(""))
    .as(Encoders.bean(Enriched.class))
    .map((MapFunction<Enriched, Enriched>) enriched -> enriched.enrich(), Encoders.bean(Enriched.class))
    .printSchema();

      

+3
java dataset apache-spark


source to share


No one has answered this question yet

Check out similar questions:

3073
How to efficiently iterate over each entry in a Java map?
1070
How can I initialize a static map?
324
Java class that implements map and preserves insert order?
25
Create new column with function in Spark Dataframe
3
How to check the content of a Spark Dataframe
2
Factorize the spark column
1
Spark createDataframe from RDD objects, column order
0
Add derived column (as array of structure) based on values ​​and ordering of other columns in Spark Scala dataframe
0
Is there a good (immutable) way to pre-define a column for an RDD, or remove a column from an RDD?
0
How to move selected columns of DataFrame to the end (rearranging column positions)?



All Articles
Loading...
X
Show
Funny
Dev
Pics