Getting Tableau to talk to Spark and Cassandra

The DataStax Spark Cassandra Connector is great for interfacing with Cassandra via Apache Spark. With Spark SQL 1.1, we can use the Thrift server to interact with Spark from Tableau. Since Tableau can talk to Spark, and Spark can talk to Cassandra, there must be some way to get Tableau to talk to Cassandra via Spark (or rather Spark SQL). I cannot figure out how to do this. Ideally I would like to do this with a Spark Standalone cluster + a Cassandra cluster (i.e. no additional Hadoop setup). Is it possible? Any pointers are appreciated.

1 answer


HiveThriftServer2 has a HiveThriftServer2.startWithContext(sqlContext) method, so you can create your own sqlContext referencing C* and the corresponding table / CF, and then pass that context to the Thrift server.

So something like this:



import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver._

val sparkContext = sc
import sparkContext._
val sqlContext = new HiveContext(sparkContext)
import sqlContext._

makeRDD((1, "hello") :: (2, "world") :: Nil).toSchemaRDD.cache().registerTempTable("t")

HiveThriftServer2.startWithContext(sqlContext)

So, instead of running Spark's default Thrift server, you can just run your custom one instead.
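The same pattern can be pointed at an actual Cassandra table rather than a dummy RDD. This is a sketch, not a tested setup: it assumes the DataStax spark-cassandra-connector is on the classpath and that spark.cassandra.connection.host is configured; the keyspace "my_keyspace", the table "users", and the User case class are placeholder names standing in for your own schema.

```scala
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import com.datastax.spark.connector._ // brings sc.cassandraTable into scope

// Hypothetical row type matching the columns of your C* table / CF.
case class User(id: Int, name: String)

val sqlContext = new HiveContext(sc)
import sqlContext._

// Read the Cassandra table into an RDD, give it a schema via the case
// class, and register it as a temp table visible to the Thrift server.
val users = sc.cassandraTable[User]("my_keyspace", "users")
users.toSchemaRDD.cache().registerTempTable("users")

// Start the Thrift server on this context; Tableau can then query the
// "users" table over the Spark SQL (Hive) ODBC/JDBC connection.
HiveThriftServer2.startWithContext(sqlContext)
```

With this running, Tableau connects to the Thrift server's host/port just as it would to any HiveServer2 endpoint, and sees "users" as an ordinary table.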
