Getting Tableau to talk to Spark and Cassandra
The DataStax Spark Cassandra Connector is great for interfacing with Cassandra from Apache Spark. With Spark SQL 1.1, we can use the Thrift server to interact with Spark from Tableau. Since Tableau can talk to Spark and Spark can talk to Cassandra, there must be some way to get Tableau talking to Cassandra via Spark (or rather Spark SQL). I cannot figure out how to do this. Ideally I would like to do this with a standalone Spark cluster + a Cassandra cluster (i.e. no additional Hadoop setup). Is it possible? Any pointers are appreciated.
HiveThriftServer2 has a HiveThriftServer2.startWithContext(sqlContext) method, so you can create your SQLContext referencing C* and the corresponding table/CF, and then pass that context to the Thrift server.
So something like this:
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver._

val sqlContext = new HiveContext(sc)  // sc is the existing SparkContext
import sqlContext._                   // brings createSchemaRDD into scope

// Register a small in-memory table as a proof of concept
sc.makeRDD((1, "hello") :: (2, "world") :: Nil)
  .toSchemaRDD
  .cache()
  .registerTempTable("t")

// Start the Thrift server against this context instead of the default one
HiveThriftServer2.startWithContext(sqlContext)
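To tie this back to the original question, the same pattern should work with a Cassandra table in place of the dummy RDD. A hedged sketch, assuming the spark-cassandra-connector is on the classpath and that a hypothetical keyspace test contains a users table with columns (id int, name text):

```scala
import com.datastax.spark.connector._            // spark-cassandra-connector
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver._

// Hypothetical case class mirroring the C* table test.users
case class User(id: Int, name: String)

val sqlContext = new HiveContext(sc)
import sqlContext._  // brings createSchemaRDD into scope (Spark 1.1)

// Read the Cassandra table as an RDD of case-class rows, lift it to a
// SchemaRDD, and register it so the Thrift server can expose it to Tableau
sc.cassandraTable[User]("test", "users")
  .toSchemaRDD
  .registerTempTable("users")

HiveThriftServer2.startWithContext(sqlContext)
```

The keyspace, table, and column names here are placeholders; substitute your own C* schema.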
So, instead of running Spark's default Thrift server, you can just run your custom one.
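Once the context-backed Thrift server is running, one way to sanity-check it before pointing Tableau at the same host/port is with beeline (10000 is the default Thrift server port; the host here is a placeholder):

```shell
# Connect to the Thrift server started above (default port 10000)
beeline -u jdbc:hive2://localhost:10000

# Then, inside the beeline session:
#   show tables;
#   select * from t limit 10;
```

Tableau would then connect to the same host/port through the Spark SQL ODBC driver.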