Why "error: not found: value StructType" when creating a SQL schema?
I have Spark 1.0.0 from CDH5 installed on CentOS 6.2, and it works without error.
When I try to run some Spark SQL, I get an error. I start my Spark shell fine ...
spark-shell --master spark://mysparkserver:7077
scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
scala> val vehicle = sc.textFile("/tmp/scala.csv")
scala> val schemaString = "year manufacturer model class engine cylinders fuel consumption clkm hlkm cmpg hmpg co2lyr co2gkm"
scala> import org.apache.spark.sql._
scala> val schema = StructType(schemaString.split(" ").map(fieldName => StructField(fieldName, StringType, true)))
But the import statement doesn't seem to work, since the last line gives an error:
scala> StructType
<console>:14: error: not found: value StructType
       StructType
       ^
I know there is a StructType class somewhere in the Spark API. And if I replace StructType in the schema line with a fully qualified name, the error changes.
Has anyone else encountered this error? Is there an extra step I am missing?
Your problem is that you are reading the programming guide for the latest version of Spark and testing it against Spark 1.0.0. Alas, StructType was introduced in Spark 1.1.0, as was the "Programmatically Specifying the Schema" section that describes it.
So without upgrading, you cannot do this, but you can use the approach from the Spark 1.0.0 guide's section "Running SQL on RDDs", which in 1.1.0 is called "Inferring the Schema Using Reflection". (Basically, if you can tolerate a fixed schema.)
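To see what the 1.1.0+ call shape looks like without a Spark installation, here is a minimal pure-Scala sketch. The StructField and StructType case classes below are stand-ins, not Spark's (Spark 1.1.0+ provides the real ones in org.apache.spark.sql); they only mirror the schema-building pattern from the question.

```scala
// Stand-in definitions -- NOT Spark's. Spark 1.1.0+ ships the real
// StructField/StructType in org.apache.spark.sql; these minimal case
// classes just let the call pattern compile and run on its own.
case class StructField(name: String, dataType: String, nullable: Boolean)
case class StructType(fields: Seq[StructField])

object SchemaSketch extends App {
  val schemaString =
    "year manufacturer model class engine cylinders fuel consumption clkm hlkm cmpg hmpg co2lyr co2gkm"

  // Same split/map pattern as in the question: one StructField per
  // space-separated field name, all nullable strings.
  val schema = StructType(
    schemaString.split(" ").map(fieldName => StructField(fieldName, "StringType", true)))

  println(schema.fields.length)     // 14 field names in schemaString
  println(schema.fields.head.name)  // "year"
}
```

With real Spark 1.1.0+, the same expression works after `import org.apache.spark.sql._`, using Spark's own StructType, StructField, and StringType.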
If you look at the various documentation URLs, you will want to replace "latest" with the version of Spark you are actually running. When in doubt, I like to bring up multiple versions of the API docs and search. I notice that, like javadoc, scaladoc has an @since annotation to make this version information clearer in the API docs, but it is not used in the Spark API docs.