Can't run Spark inside a Scala worksheet in IntelliJ IDEA

The following code runs without issue if I put it inside an object that extends the App trait and launch it with IDEA's Run command.

However, when I try to run it from a worksheet, I hit one of these two errors:

1- If the first line (the addOuterScope call) is present, I get:

Task not serializable: java.io.NotSerializableException: A$A34$A$A34

2- If the first line is commented out, I get:

Unable to generate an encoder for inner class `A$A35$A$A35$A12` without access to the scope that this class was defined in.

//First line!
org.apache.spark.sql.catalyst.encoders.OuterScopes.addOuterScope(this)

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

case class AClass(id: Int, f1: Int, f2: Int)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("Test App")
  .getOrCreate()
import spark.implicits._

val schema = StructType(Array(
  StructField("id", IntegerType),
  StructField("f1", IntegerType),
  StructField("f2", IntegerType)))

val df = spark.read.schema(schema)
  .option("header", "true")
  .csv("dataset.csv")

// Displays the content of the DataFrame to stdout
df.show()
val ads = df.as[AClass]

// This is the line that causes the serialization error
ads.foreach(x => println(x))
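
The first error names the worksheet's generated class (A$A34$A$A34) as the object that could not be serialized. As a rough illustration of how that kind of failure arises in general, here is a plain-Scala sketch with no Spark involved. WorksheetLike and ClosureCaptureDemo are hypothetical names, and the assumption that the worksheet wrapper gets dragged along in the same way is not confirmed by anything in the question.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for the worksheet's generated wrapper class.
// Note that it does NOT extend Serializable.
class WorksheetLike {
  val offset = 1

  // Reads the field `offset`, so the function value holds a reference to `this`.
  val addOffset: Int => Int = x => x + offset

  // Copies the field into a local first, so only an Int is captured.
  def detached: Int => Int = {
    val local = offset
    x => x + local
  }
}

object ClosureCaptureDemo {
  // Returns None if Java serialization succeeds,
  // otherwise the message of the NotSerializableException (the offending class name).
  def offendingClass(value: AnyRef): Option[String] = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(value)
      None
    } catch {
      case e: NotSerializableException => Some(e.getMessage)
    } finally {
      out.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val w = new WorksheetLike
    println(offendingClass(w.addOffset)) // reports the enclosing class
    println(offendingClass(w.detached))  // nothing non-serializable is captured
  }
}
```

Spark serializes everything a task needs before sending it to executors, so any reference that leads back to a non-serializable enclosing instance fails in the same way.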

      

The project was built using the IDEA Scala plugin, and this is my build.sbt:

   ...
   scalaVersion := "2.10.6"
   scalacOptions += "-unchecked"
   libraryDependencies ++= Seq(
       "org.apache.spark" % "spark-core_2.10" % "2.1.0",
       "org.apache.spark" % "spark-sql_2.10" % "2.1.0",
       "org.apache.spark" % "spark-mllib_2.10" % "2.1.0"
       )

      

I tried the solution in this answer, but it doesn't work in IDEA Ultimate 2017.1, which I am using. Also, when I use worksheets I would prefer not to add an extra object to the worksheet if at all possible.

If I call collect() on the Dataset and get an array of AClass instances, there are no more errors. It is working with the Dataset directly that causes the error.
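
A minimal sketch of that collect() workaround, written as a standalone program so it is self-contained. The in-memory rows are hypothetical stand-ins for dataset.csv, and CollectWorkaround and collectRows are illustrative names that do not come from the question.

```scala
import org.apache.spark.sql.SparkSession

// Declared at top level (not nested) so Spark can derive an encoder for it.
case class AClass(id: Int, f1: Int, f2: Int)

object CollectWorkaround {
  // Builds a small Dataset and returns its rows as a local array on the driver.
  def collectRows(spark: SparkSession): Array[AClass] = {
    import spark.implicits._
    // Hypothetical in-memory rows standing in for dataset.csv.
    val ads = Seq(AClass(1, 10, 20), AClass(2, 30, 40)).toDS()
    ads.collect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("Test App")
      .getOrCreate()
    // This foreach is Array.foreach, an ordinary local loop:
    // the println closure is never shipped to executors.
    collectRows(spark).foreach(println)
    spark.stop()
  }
}
```

In the worksheet itself the equivalent would be just ads.collect().foreach(println). Keep in mind that collect() pulls every row into driver memory, so it only suits small datasets.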

1 answer


Use Eclipse compatibility mode: open Preferences, search for "scala", select Scala under Languages & Frameworks, choose Worksheet, and enable Eclipse compatibility mode. See https://gist.github.com/RAbraham/585939e5390d46a7d6f8


