Is there a way for Spark to read AWS S3 files without using Hadoop?
Standalone programs can read / write AWS S3 files without Hadoop by using the AWS jar files. Spark programs can read / write ordinary files without Hadoop. However, Spark programs that read / write AWS S3 files do require Hadoop. And even then, Spark 1.4 with Hadoop 2.6 or 2.7 throws runtime errors about missing Hadoop classes for S3, even when the Hadoop directory is set.
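For context, a minimal sketch of such a standalone read, assuming the AWS SDK for Java (v1) is on the classpath; the bucket and key names here are placeholders:

```scala
import com.amazonaws.services.s3.AmazonS3Client
import scala.io.Source

object StandaloneS3Read {
  def main(args: Array[String]): Unit = {
    // Credentials come from the SDK's default provider chain
    // (environment variables, ~/.aws/credentials, or an instance profile).
    val s3 = new AmazonS3Client()
    // Placeholder bucket and key.
    val obj = s3.getObject("my-bucket", "path/to/file.txt")
    val text = Source.fromInputStream(obj.getObjectContent).mkString
    println(text)
    obj.close()
  }
}
```

No Hadoop classes are involved anywhere in this program, which is what motivates the question below.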
- Is there a way for Spark programs to read / write S3 files without Hadoop, using only the AWS jar files?
- If not, how do I fix Spark's missing Hadoop classes for S3 at runtime? (See the sketch after this list for the kind of code that triggers it.)
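To illustrate the second question, here is a hedged sketch of the kind of Spark code involved; the bucket, key, and credential lookup are placeholders. Reading an s3n:// path goes through Hadoop's FileSystem API, so the S3 filesystem class must be on Spark's classpath at runtime:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkS3Read {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SparkS3Read"))
    // s3n:// URIs are resolved through Hadoop's FileSystem API; the
    // implementing class lives in the hadoop-aws jar, which is what goes
    // missing at runtime when that jar is not on the classpath.
    sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", sys.env("AWS_ACCESS_KEY_ID"))
    sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", sys.env("AWS_SECRET_ACCESS_KEY"))
    val lines = sc.textFile("s3n://my-bucket/path/to/file.txt") // placeholder path
    println(lines.count())
    sc.stop()
  }
}
```

One commonly suggested remedy (not confirmed by the question itself) is to put the hadoop-aws jar, in a version matching the Hadoop build, together with its AWS SDK dependency on the driver and executor classpaths, for example via spark-submit's --jars option.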