When does Cassandra remove data from SSTables?

In Cassandra 2.x, when I delete one or more columns, they get a tombstone in the Memtable, but no data is deleted. At some point the Memtable is flushed to an SSTable, including the deleted data and the tombstone. When compaction runs, it keeps the tombstone for the configured grace period. What happens to the data? I deleted a bunch of columns last week, less than gc_grace_seconds ago. I'm not sure whether compaction has started yet. I haven't seen any change in disk usage so far, so I was wondering: when is the data physically deleted from disk?



1 answer


In Cassandra 2.x, when I delete one or more columns, they get a tombstone in the Memtable, but the data is not deleted. At some point the Memtable is flushed to an SSTable, including the deleted data and the tombstone. When compaction runs, it keeps the tombstone for the configured grace period.

True.

What happens to the data?

The data will remain on disk for at least gc_grace_seconds. The next minor compaction after gc_grace_seconds has elapsed might remove it, but the actual timing depends mostly on your dataset and workload type.
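To make the timing concrete, here is a rough sketch of the arithmetic. It assumes the table still uses Cassandra's default gc_grace_seconds of 864000 (10 days); the function names and the dates are made up for illustration, and it only gives the earliest point at which a compaction could purge the tombstone, not when one will actually run.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the table uses Cassandra's default gc_grace_seconds (10 days).
DEFAULT_GC_GRACE_SECONDS = 864_000


def earliest_purge_time(deleted_at, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Earliest moment at which a compaction may drop the tombstone."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


def seconds_until_purgeable(deleted_at, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Seconds left before the tombstone becomes eligible for purging (0 if already eligible)."""
    remaining = (earliest_purge_time(deleted_at, gc_grace_seconds) - now).total_seconds()
    return max(0.0, remaining)


# Hypothetical dates: the delete happened exactly one week before "now".
deleted_at = datetime(2015, 1, 1, tzinfo=timezone.utc)
now = deleted_at + timedelta(days=7)

print(seconds_until_purgeable(deleted_at, now) / 86_400)  # 3.0 days still to wait
```

With a delete issued a week ago and the default grace period, no change in disk usage is expected yet: the tombstone is still three days away from being purgeable.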



I deleted a bunch of columns last week, less than gc_grace_seconds ago. I'm not sure whether compaction has started yet. I haven't seen any change in disk usage so far, so I was wondering: when is the data physically deleted from disk?

If you want to free up disk space, you can:

  • Wait for gc_grace_seconds to pass and let a normal minor compaction run.
  • Run nodetool compact, which will run a major compaction on the current node and free up disk space.