Why does new Job() throw java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING?

I am attempting to write data to Parquet in Spark 1.1.1.

I am following the Powerful Trio for Big Data: Spark, Parquet and Avro article as a template. The code in that article configures a Hadoop Job and passes it to the ParquetOutputFormat setup methods.
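For context, the setup I am trying to reproduce looks roughly like this (a sketch assuming the parquet-mr 1.x Avro bindings; SampleRecord, rdd, and the output path are placeholders, not code from the article):

import org.apache.hadoop.mapreduce.Job
import parquet.avro.{AvroParquetOutputFormat, AvroWriteSupport}
import parquet.hadoop.ParquetOutputFormat

// Configure a Hadoop Job so ParquetOutputFormat knows how to write Avro records.
val job = new Job()
ParquetOutputFormat.setWriteSupportClass(job, classOf[AvroWriteSupport])
AvroParquetOutputFormat.setSchema(job, SampleRecord.SCHEMA$)

// Write (key, record) pairs through the new-API Hadoop output format.
rdd.map(r => (null, r)).saveAsNewAPIHadoopFile(
  "hdfs:///tmp/sample-output",
  classOf[Void],
  classOf[SampleRecord],
  classOf[ParquetOutputFormat[SampleRecord]],
  job.getConfiguration)

But the very first step, creating the Job, already fails in spark-shell: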

scala> import org.apache.hadoop.mapreduce.Job 
scala> val job = new Job() 
java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
    at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:283)
    at org.apache.hadoop.mapreduce.Job.toString(Job.java:452)
    at scala.runtime.ScalaRunTime$.scala$runtime$ScalaRunTime$$inner$1(ScalaRunTime.scala:324)
    at scala.runtime.ScalaRunTime$.stringOf(ScalaRunTime.scala:329)
    at scala.runtime.ScalaRunTime$.replStringOf(ScalaRunTime.scala:337)
    at .<init>(<console>:10)
    at .<clinit>(<console>)
    ...

      

1 answer


Spark and MapReduce jobs work differently, but Spark reuses the MapReduce InputFormat (and OutputFormat) classes for HDFS I/O. That is the only reason a Hadoop Job object appears here at all: it serves as a carrier for configuration and is never actually submitted.
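For example, when Spark reads a text file from HDFS it goes through the MapReduce TextInputFormat under the hood; you can make that explicit yourself (a minimal sketch, the path is a placeholder):

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// Yields the same (byte offset, line) records a MapReduce job would see.
val lines = sc.newAPIHadoopFile(
  "hdfs:///tmp/sample-input",
  classOf[TextInputFormat],
  classOf[LongWritable],
  classOf[Text]
).map(_._2.toString)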



From the stack trace, the exception does not come from constructing the Job at all: it comes from Job.toString. The Scala REPL prints the value of every expression you enter, and printing the new Job calls toString, which calls ensureState(RUNNING) and throws because the job is still being defined (DEFINE state). The job setup itself succeeded. Try running the code with spark-submit rather than the spark shell; nothing will implicitly call toString on the Job, and the error goes away.
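If you would rather stay in spark-shell, you can also just stop the REPL from echoing results while you set up the job. The :silent command (a standard Scala REPL command that toggles automatic printing of results) should work here, though I have not verified it in every shell version:

scala> :silent
scala> val job = new Job()   // result is no longer printed, so Job.toString is never called
scala> :silent               // toggle printing back on when you are done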
