ClassNotFoundException when submitting JAR to Spark via spark-submit

I am trying to submit a JAR to Apache Spark using spark-submit.

To keep things simple, I experimented with the code from a blog post:

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

object SimpleScalaSpark { 
  def main(args: Array[String]) {
    val logFile = "/Users/toddmcgrath/Development/spark-1.6.1-bin-hadoop2.4/README.md" // I've replaced this with the path to an existing file
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}


I am creating this with IntelliJ IDEA 2017.1 and running it on Spark 2.1.0. Everything works fine when I run it in the IDE.

Then I package it as a JAR and try to run it with spark-submit like this:

./spark-submit --class SimpleScalaSpark --master local[*] ~/Documents/Spark/Scala/supersimple/out/artifacts/supersimple_jar/supersimple.jar


This results in the following error:

java.lang.ClassNotFoundException: SimpleScalaSpark
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:229)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:695)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


I don't understand what I am missing ... especially considering that it works as expected in the IDE.



4 answers


As per your description above, you are not supplying the correct class name, so spark-submit cannot find the class.

Just replace SimpleSparkScala with SimpleScalaSpark.



Try this command:

./spark-submit --class SimpleScalaSpark --master local[*] ~/Documents/Spark/Scala/supersimple/out/artifacts/supersimple_jar/supersimple.jar



It looks like there is a problem with your jar. You can check which classes are present in your jar using the command: vi supersimple.jar



If the SimpleScalaSpark class does not appear in the output of the previous command, it means that your jar is not built properly.
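
A more direct way to inspect the contents is the jar tool that ships with the JDK; a minimal sketch, with the jar path taken from the question:

# List every entry in the jar and filter for the compiled class.
# A Scala object compiles to both SimpleScalaSpark.class and
# SimpleScalaSpark$.class; since the object is not in a package,
# both should appear at the top level of the jar.
jar tf ~/Documents/Spark/Scala/supersimple/out/artifacts/supersimple_jar/supersimple.jar | grep SimpleScalaSpark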



IDEs work differently compared to the shell. I believe that for the shell you need to add the --jars parameter so that dependency jars end up on the classpath (see: spark submit add some jars on the classpath).
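
For illustration, a hedged sketch of what that can look like; the dependency jar path here is hypothetical:

# --jars takes a comma-separated list of extra jars to put on the
# driver and executor classpaths; the application jar itself is
# still passed as the last argument.
./spark-submit \
  --class SimpleScalaSpark \
  --master local[*] \
  --jars /path/to/some-dependency.jar \
  ~/Documents/Spark/Scala/supersimple/out/artifacts/supersimple_jar/supersimple.jar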



I am observing ClassNotFoundException for new classes that I introduce. I am using a fat jar, and I have verified that the JAR file contains the new class file on every node. (I'm loading my Spark app from the regular filesystem, not HDFS or an HTTP URL.) It turned out that the JAR file actually loaded by the worker did not contain the newly introduced class; it was an older version. The only way I found around the problem is to use a different filename for the JAR every time I call the spark-submit script.
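
A minimal sketch of that workaround, assuming the fat jar is built to the path from the question; the timestamped target path is illustrative:

# Copy the freshly built jar to a unique, timestamped name so that no
# node can serve a stale cached copy, then submit that unique file.
STAMP=$(date +%Y%m%d%H%M%S)
cp ~/Documents/Spark/Scala/supersimple/out/artifacts/supersimple_jar/supersimple.jar /tmp/supersimple-$STAMP.jar
./spark-submit --class SimpleScalaSpark --master local[*] /tmp/supersimple-$STAMP.jar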







