Why doesn't spark-shell load the imported RDD class file?

I am using Spark 2.1.1 with Scala 2.11.8.

Inside spark-shell I am using the :load command to load a class whose methods work with RDDs.

When I try to load the class, I get the following compilation error:

error: not found: type RDD

Why? I do have an import statement.

[screenshot of the error output]

This is the code I am working with:

[screenshot of the class definition]
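A minimal file that reproduces the error looks roughly like this (the names here are illustrative, not the code from the screenshot); the import sits at the top of the file, outside the class:

// workWithRdd.scala -- illustrative reconstruction, not the original file
import org.apache.spark.rdd.RDD

class WorkWithRdd {
  // any method that mentions RDD in its signature fails to compile under :load
  def lowercase(lines: RDD[String]): RDD[String] = lines.map(_.toLowerCase)
}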


1 answer


This seems to be a feature of :load in spark-shell. The solution is to move import org.apache.spark.rdd.RDD (no dot and underscore, i.e. not the wildcard form) inside your class definition.

This is not specific to the RDD class; it applies to all imported classes. The import will not take effect unless the import statement is defined inside the class itself.

With that said, the following won't work because the import is outside the class:

import org.apache.spark.rdd.RDD
class Hello {
  def get(rdd: RDD[String]): RDD[String] = rdd
}

scala> :load hello.scala
Loading hello.scala...
import org.apache.spark.rdd.RDD
<console>:12: error: not found: type RDD
         def get(rdd: RDD[String]): RDD[String] = rdd
                                    ^
<console>:12: error: not found: type RDD
         def get(rdd: RDD[String]): RDD[String] = rdd
                      ^

      

You can see what's going on under the covers using the -v flag of :load:



scala> :load -v hello.scala
Loading hello.scala...

scala>

scala> import org.apache.spark.rdd.RDD
import org.apache.spark.rdd.RDD

scala> class Hello {
     |   def get(rdd: RDD[String]): RDD[String] = rdd
     | }
<console>:12: error: not found: type RDD
         def get(rdd: RDD[String]): RDD[String] = rdd
                                    ^
<console>:12: error: not found: type RDD
         def get(rdd: RDD[String]): RDD[String] = rdd
                      ^

      

This got me wondering whether an import inside the class definition might help. It did (much to my surprise)!

class Hello {
  import org.apache.spark.rdd.RDD
  def get(rdd: RDD[String]): RDD[String] = rdd
}

scala> :load -v hello.scala
Loading hello.scala...

scala> class Hello {
     |   import org.apache.spark.rdd.RDD
     |   def get(rdd: RDD[String]): RDD[String] = rdd
     | }
defined class Hello
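
As a quick sanity check (not part of the original session), the loaded class can then be used with the sc that spark-shell already provides; the expected output is shown as a comment:

scala> val hello = new Hello
scala> hello.get(sc.parallelize(Seq("a", "b", "c"))).collect
// expected result: Array(a, b, c)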

      

You can also use the :paste command to paste a class into spark-shell. It has a so-called raw mode in which you can define classes in their own package.

package mypackage

class Hello {
  import org.apache.spark.rdd.RDD
  def get(rdd: RDD[String]): RDD[String] = rdd
}

scala> :load -v hello.scala
Loading hello.scala...

scala> package mypackage
<console>:1: error: illegal start of definition
package mypackage
^

scala>

scala> class Hello {
     |   import org.apache.spark.rdd.RDD
     |   def get(rdd: RDD[String]): RDD[String] = rdd
     | }
defined class Hello

scala> :paste -raw
// Entering paste mode (ctrl-D to finish)

package mypackage

class Hello {
  import org.apache.spark.rdd.RDD
  def get(rdd: RDD[String]): RDD[String] = rdd
}

// Exiting paste mode, now interpreting.
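
After the raw paste the class lives in mypackage, so you refer to it by its qualified name. Again, this follow-up is illustrative rather than part of the original session:

scala> val hello = new mypackage.Hello
scala> hello.get(sc.parallelize(Seq("spark", "shell"))).collect
// expected result: Array(spark, shell)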

      
