Reading Avro into Spark using spark-avro

I cannot read Avro files into Spark using the spark-avro library. Here are the steps I took:

  • Downloaded the jar from: http://mvnrepository.com/artifact/com.databricks/spark-avro_2.10/0.1
  • Launched the Spark shell with spark-shell --jars avro/spark-avro_2.10-0.1.jar

  • Ran the commands listed in the project README:

    import com.databricks.spark.avro._
    import org.apache.spark.sql.SQLContext
    val sqlContext = new SQLContext(sc)
    val episodes = sqlContext.avroFile("episodes.avro")

  • The sqlContext.avroFile("episodes.avro") call fails with the following error:

    scala> val episodes = sqlContext.avroFile("episodes.avro")
    java.lang.IncompatibleClassChangeError: class com.databricks.spark.avro.AvroRelation has interface org.apache.spark.sql.sources.TableScan as super class
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)

1 answer


My bad. The README clearly states:

Versions

Spark changed how it reads / writes data in 1.4, so please use the version of this library that matches your Spark version:

1.3 -> 1.0.0

1.4+ -> 1.1.0-SNAPSHOT



I was using Spark 1.3.1 with spark-avro 1.1.0. When I switched to spark-avro 1.0.0, it worked.
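For reference, a sketch of the working setup, assuming the 1.0.0 jar has been downloaded (the jar filename here is inferred from the Maven coordinates above, not confirmed):

    # Spark 1.3.x pairs with spark-avro 1.0.0 per the README's version table
    spark-shell --jars spark-avro_2.10-1.0.0.jar

With the matching jar on the classpath, the same commands from the README (import com.databricks.spark.avro._, then sqlContext.avroFile("episodes.avro")) run without the IncompatibleClassChangeError, since the AvroRelation class was compiled against the data-sources API of the same Spark line.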
