Cassandra bulk insert operation, internally

I'm looking for the Cassandra/CQL cousin of the common SQL idiom INSERT INTO ... SELECT ... FROM ... and couldn't find anything to do such an operation programmatically or in CQL. Is it just not supported?

My use case is a rather large copy from one table to another. I don't need any specific concurrency guarantees, but I have a lot of data, so I would like to avoid the additional network overhead of writing a client that fetches data from one table and then issues batches of inserts to another table. I understand that changes will still need to be pushed across the nodes of the Cassandra cluster as per the replication setup, but it seems reasonable to have an "internal" option for a bulk operation from one table to another. Is there such a thing in CQL or elsewhere? I am currently using Hector to talk to Cassandra.

Edit: It looks like sstableloader might be relevant, but it is horribly low level for what I would expect to be a fairly common use case. Copying only a subset of rows from one table to another also seems less than trivial with that approach.



1 answer


That's right, this is not natively supported. (Another alternative would be a map/reduce job.) The Cassandra API focuses on short requests from applications at scale rather than on batch or analytic queries.
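Since there is no server-side equivalent of INSERT INTO ... SELECT, the copy ends up going through a client, as the question describes. A minimal sketch of that read-filter-batch-write loop follows. It is driver-agnostic: `copy_rows`, `chunked`, `read_rows`, `write_batch`, and `keep` are hypothetical names introduced here for illustration and are not part of Hector or any Cassandra driver. In-memory lists stand in for the source and target tables.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Row = Tuple  # one row, represented as a tuple of column values


def chunked(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def copy_rows(
    read_rows: Callable[[], Iterable[Row]],
    write_batch: Callable[[List[Row]], None],
    batch_size: int = 100,
    keep: Optional[Callable[[Row], bool]] = None,
) -> int:
    """Stream rows from a reader to a writer in batches; return rows copied.

    `read_rows` and `write_batch` are placeholders (assumptions) for whatever
    calls the real client uses to scan the source table and to insert a
    group of rows into the target table.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Optional row filter, for copying only a subset of the source table.
    selected = (row for row in read_rows() if keep is None or keep(row))
    copied = 0
    for chunk in chunked(selected, batch_size):
        write_batch(chunk)
        copied += len(chunk)
    return copied


# In-memory stand-ins for the two tables (illustration only).
source = [(i, f"user{i}") for i in range(10)]
target: List[Row] = []
batch_sizes: List[int] = []


def write_to_target(chunk: List[Row]) -> None:
    batch_sizes.append(len(chunk))
    target.extend(chunk)


copied = copy_rows(
    lambda: iter(source),
    write_to_target,
    batch_size=4,
    keep=lambda row: row[0] % 2 == 0,  # copy only rows with an even id
)
```

Because the rows are streamed through a generator, only one batch is held in memory at a time. The network round trips the question hoped to avoid are still there; the sketch only keeps them bounded per batch.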


