How do I insert rows into Cassandra only if they don't already exist, using the spark-cassandra connector?

I want to write to Cassandra from a DataFrame using the spark-cassandra connector, and I want to skip any row whose primary key already exists in the table. I know writes are normally upserts, but I don't want to change the other columns of an existing row. Is there a way to do this?

Thanks.



3 answers


You can use the ifNotExists option of WriteConf, which was introduced in this PR.

It works like this:



import com.datastax.spark.connector._
import com.datastax.spark.connector.writer.WriteConf
val writeConf = WriteConf(ifNotExists = true)
rdd.saveToCassandra(keyspaceName, tableName, writeConf = writeConf)

      



Srinu, it all boils down to a read before write, whether you are using Spark or not.

But there is an IF NOT EXISTS clause:



If the column exists, it is updated. The row is created if none exists. Use IF NOT EXISTS to perform the insert only if the row does not already exist. Using IF NOT EXISTS incurs a performance penalty associated with using Paxos internally. For information about Paxos, see the Cassandra 2.1 documentation or the Cassandra 2.0 documentation.

http://docs.datastax.com/en/cql/3.1/cql/cql_reference/insert_r.html
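The quoted clause can be tried directly through the connector's session helper. This is only a minimal sketch that can be pasted into spark-shell with the connector on the classpath: the contact point 127.0.0.1, the keyspace my_ks, and the table users(id int PRIMARY KEY, name text) are all hypothetical names chosen for illustration.

```scala
import com.datastax.spark.connector.cql.CassandraConnector
import org.apache.spark.SparkConf

// Assumed contact point; adjust for your cluster.
val conf = new SparkConf()
  .set("spark.cassandra.connection.host", "127.0.0.1")

// Hypothetical table: my_ks.users(id int PRIMARY KEY, name text).
// IF NOT EXISTS turns the insert into a lightweight transaction (Paxos).
val applied: Boolean = CassandraConnector(conf).withSessionDo { session =>
  session
    .execute("INSERT INTO my_ks.users (id, name) VALUES (1, 'alice') IF NOT EXISTS")
    .wasApplied() // false when a row with this primary key was already present
}

println(s"inserted: $applied")
```

Running the statement a second time should leave the existing row untouched, which is the behavior the question asks for, at the cost of the Paxos round trips mentioned in the quote.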



You can set this option on the Spark configuration:

sparkConf.set("spark.cassandra.output.ifNotExists", "true")

      

With this setting, if the partition key and clustering columns of a row being written match a row that already exists in Cassandra, the write is ignored; otherwise, the write is performed.
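Since the question is about a DataFrame, here is a rough sketch of how the same setting might be combined with a DataFrame write. It assumes a standalone application, a local Cassandra node at 127.0.0.1, and a hypothetical table my_ks.users(id int PRIMARY KEY, name text); none of these names come from the original answers.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object InsertIfNotExists {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("insert-if-not-exists")
      .master("local[*]")
      // Assumed contact point; adjust for your cluster.
      .config("spark.cassandra.connection.host", "127.0.0.1")
      // Makes the connector issue INSERT ... IF NOT EXISTS for every row.
      .config("spark.cassandra.output.ifNotExists", "true")
      .getOrCreate()
    import spark.implicits._

    // Sample rows for the hypothetical table my_ks.users.
    val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

    // Rows whose primary key already exists are skipped instead of upserted.
    df.write
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "my_ks", "table" -> "users"))
      .mode(SaveMode.Append)
      .save()

    spark.stop()
  }
}
```

Because every row then goes through a lightweight transaction, write throughput is likely to be noticeably lower than with plain upserts.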

https://docs.datastax.com/en/cql/3.1/cql/cql_reference/insert_r.html#reference_ds_gp2_1jp_xj__if-not-exists

https://github.com/datastax/spark-cassandra-connector/blob/master/doc/reference.md#write-tuning-parameters
