ClassNotFoundException when submitting JAR to Spark via spark-submit
I am trying to submit JARs to Apache Spark using spark-submit.
To keep things simple, I experimented with the code from this post:
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

object SimpleScalaSpark {
  def main(args: Array[String]) {
    val logFile = "/Users/toddmcgrath/Development/spark-1.6.1-bin-hadoop2.4/README.md" // I've replaced this with the path to an existing file
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
I am building this with IntelliJ IDEA 2017.1 and running it on Spark 2.1.0. Everything works fine when I run it inside the IDE.
Then I package it as a JAR and try to run it with spark-submit like this:
./spark-submit --class SimpleScalaSpark --master local[*] ~/Documents/Spark/Scala/supersimple/out/artifacts/supersimple_jar/supersimple.jar
This results in the following error
java.lang.ClassNotFoundException: SimpleScalaSpark
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:229)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:695)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I don't understand what I am missing ... especially considering that it works as expected in the IDE.
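One way to sanity-check the packaging is to list the JAR's contents and confirm that SimpleScalaSpark.class is actually in it (the jar tool ships with the JDK; the path below is the same one passed to spark-submit):

jar tf ~/Documents/Spark/Scala/supersimple/out/artifacts/supersimple_jar/supersimple.jar | grep SimpleScalaSpark

If nothing is printed, the artifact was not built from the current sources.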
Based on your description, you are not supplying the correct class name, so Spark cannot find the class.
Just replace SimpleSparkScala with SimpleScalaSpark.
Try this command:
./spark-submit --class SimpleScalaSpark --master local[*] ~/Documents/Spark/Scala/supersimple/out/artifacts/supersimple_jar/supersimple.jar
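Note that --class must be the fully qualified name of the object. If SimpleScalaSpark were declared inside a package (say package com.example, used here purely as an illustration), the command would need that prefix:

./spark-submit --class com.example.SimpleScalaSpark --master local[*] ~/Documents/Spark/Scala/supersimple/out/artifacts/supersimple_jar/supersimple.jar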
I am observing ClassNotFoundException for new classes that I introduce. I am using a fat JAR. I have verified that the JAR file at the submitted path contains the new class file on every node (I'm loading my Spark app from the regular filesystem, not from HDFS or an HTTP URL). Yet the JAR the worker actually picks up does not contain the new class; it is an older version. The only workaround I have found is to use a different filename for the JAR every time I call the spark-submit script.
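A minimal sketch of that rename workaround, assuming the fat JAR sits in the current directory and SimpleScalaSpark is the main class (the timestamp suffix is just one way to get a fresh filename per submission):

ts=$(date +%s)
cp supersimple.jar "supersimple-$ts.jar"
./spark-submit --class SimpleScalaSpark --master local[*] "supersimple-$ts.jar"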