How does Spark's HiveContext / SQLContext retrieve the schema and data?
I can't find much documentation on this. When I pull data from Hive into Spark SQL, how does Spark get the schema? Does it automatically look it up in the Hive metastore? Also, does Spark talk to Hive to find the file location and pull the data into the DataFrame? And how does it handle views, or can't it handle views yet?
1 answer
- Yes, it looks the schema up in the Hive metastore.
- Spark resolves the tables named in a Hive query through the metastore, runs the query, and turns the output into a DataFrame of Row objects. From the docs:
"When working with Hive, one must construct a HiveContext, which inherits from SQLContext, and adds support for finding tables in the MetaStore and writing queries using HiveQL."
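To make this concrete, here is a minimal sketch of the Spark 1.x API the quote describes. It assumes the application is launched with spark-submit, that hive-site.xml is on the classpath so Spark can reach the metastore, and that a Hive table exists under the hypothetical name my_table. None of those details come from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object HiveSchemaExample {
  def main(args: Array[String]): Unit = {
    // The master URL is expected to be supplied by spark-submit.
    val conf = new SparkConf().setAppName("HiveSchemaExample")
    val sc = new SparkContext(conf)

    // HiveContext extends SQLContext and adds metastore lookups and HiveQL.
    val hiveContext = new HiveContext(sc)

    // "my_table" is a hypothetical table name; its schema is resolved
    // from the Hive metastore, not declared in this program.
    val df = hiveContext.table("my_table")
    df.printSchema()

    // A HiveQL query returns a DataFrame of Row objects.
    val sample = hiveContext.sql("SELECT * FROM my_table LIMIT 10")
    sample.show()

    sc.stop()
  }
}
```

Note that no schema is declared anywhere in the program: printSchema() prints whatever the metastore reports for the table. In Spark 2.0 and later, HiveContext is deprecated in favor of a SparkSession built with enableHiveSupport().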