Spark saveAsNewAPIHadoopFile works in local mode but not in cluster mode
After upgrading to CDH 5.4 and Spark Streaming 1.3, I ran into a strange issue: saveAsNewAPIHadoopFile no longer saves files to HDFS as it is supposed to. I can see that a _temp directory is created, but when the save finishes, the _temp directory is removed and the output directory is left empty except for a SUCCESS file. My impression is that the files were generated but were not moved out of the _temp directory before it was removed.
This issue only occurs when running on a Spark cluster (standalone). If I run the job with local Spark, the files are saved as expected.
Some help would be appreciated.