Accessing Cassandra Nodes in Spark

I have two Cassandra nodes and I am developing a Java-Spark application.

I have one Spark Master and two slaves. The following code is used to connect to one Cassandra node:

sparkConf.set("spark.cassandra.connection.host", "server");

How do I add additional Cassandra nodes?



2 answers


The documentation is simple enough:

new SparkConf(true)
   .set("spark.cassandra.connection.host", "192.168.123.10")

And just below:



"Multiple hosts can be transferred using a comma separated list (" 127.0.0.1,127.0.0.2 "). These are just the initial contact points, all nodes in the local DC will be used when connecting."

In other words, you only need to give the connector one reachable node of your cluster; it discovers the remaining nodes in the local data center on its own. The comma-separated list simply supplies several initial contact points, which helps if the first node happens to be down when the application starts.
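For example, with the two nodes from the question you could list both as contact points. This is only a minimal sketch: 192.168.123.10 and 192.168.123.11 are placeholder addresses, and the keyspace/table names "ks"/"tab" are assumed.

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

object TwoContactPointsExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf(true)
      .setAppName("cassandra-two-contact-points")
      // Placeholder IPs: both nodes of the same cluster as initial contact points.
      .set("spark.cassandra.connection.host", "192.168.123.10,192.168.123.11")

    val sc = new SparkContext(conf)

    // The connector discovers the rest of the local DC; "ks"/"tab" are placeholder names.
    val rows = sc.cassandraTable("ks", "tab")
    println(rows.count())

    sc.stop()
  }
}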



You can try this if you are using Scala; I couldn't find an equivalent for Python. It creates a separate CassandraConnector for each cluster and scopes each one with an implicit value:

import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.CassandraConnector

// One connector per cluster, each with its own contact point.
val connectorToClusterOne = CassandraConnector(sc.getConf.set("spark.cassandra.connection.host", "127.0.0.1"))
val connectorToClusterTwo = CassandraConnector(sc.getConf.set("spark.cassandra.connection.host", "127.0.0.2"))

// Read from cluster one; the implicit connector is scoped to this block.
val rddFromClusterOne = {
  implicit val c = connectorToClusterOne
  sc.cassandraTable("ks", "tab")
}

// Write the same rows to cluster two.
{
  implicit val c = connectorToClusterTwo
  rddFromClusterOne.saveToCassandra("ks", "tab")
}

Good luck!







