Taking a complete backup of Cassandra DB
cqlsh has no single command that backs up an entire Cassandra database with all of its keyspaces, so this post walks through a workaround.
cqlsh does provide the COPY command, which exports an individual table to a CSV file:
COPY table_name TO 'table_name.csv' WITH HEADER=TRUE;
Doing this by hand for an entire Cassandra cluster with multiple keyspaces and many tables is tedious. The following set of commands makes it easier.
First, create a folder to store the backup CSV files
mkdir cassandrabkp
cd into the folder
cd cassandrabkp
Assuming that you are using bash and have cqlsh, grep, and sed installed, the following command lists every table in the schema and exports each one to its own CSV file, named after the keyspace and table. Replace 192.168.134.132 with the address of one of your Cassandra nodes.
for t in $(cqlsh 192.168.134.132 -e "DESCRIBE SCHEMA" | grep "CREATE TABLE" | sed 's/CREATE TABLE //; s/ (.*//'); do cqlsh 192.168.134.132 -e "COPY $t TO '$t.csv' WITH HEADER=TRUE"; done
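To preview what the loop does before pointing it at a real cluster, the table-name extraction and the command generation can be tried against sample text. The sketch below is only an illustration: the sample_schema text, the shop keyspace, and the extract_tables helper are made up for this example, and real DESCRIBE SCHEMA output is much longer. It uses a slightly condensed variant of the same grep and sed filtering.

```shell
#!/usr/bin/env bash
# Hypothetical sample of DESCRIBE SCHEMA output (assumption, not real cluster data).
sample_schema="CREATE KEYSPACE shop WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};

CREATE TABLE shop.orders (
    id uuid PRIMARY KEY,
    total decimal
);

CREATE TABLE shop.customers (
    id uuid PRIMARY KEY,
    name text
);"

# Hypothetical helper: keep only CREATE TABLE lines, then strip the leading
# keyword and everything from " (" onward, leaving keyspace.table.
extract_tables() {
    grep "CREATE TABLE" | sed 's/CREATE TABLE //; s/ (.*//'
}

host="192.168.134.132"   # node address used in this post; replace with your own
tables=$(printf '%s\n' "$sample_schema" | extract_tables)

# Dry run: print the cqlsh commands instead of executing them.
commands=$(for t in $tables; do
    echo "cqlsh $host -e \"COPY $t TO '$t.csv' WITH HEADER=TRUE\""
done)

printf '%s\n' "$tables"
printf '%s\n' "$commands"
```

Printing the commands first makes it easy to spot a table name that was parsed incorrectly, for example one that needs quoting, before any export runs.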
The above process will take some time, depending on the size of the data.
Now you can cd out of the folder and compress it to save space. Because the CSV files are plain text, they compress well: in my experience, a 13GB backup folder shrinks to about 2.5GB.
cd ..
tar -cvzf cassandrabkp.tar.gz cassandrabkp/
You can extract it later with the following command
tar -xvzf cassandrabkp.tar.gz
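To check what an archive contains without extracting it, tar can list the entries with the -t flag. The following is a minimal sketch that builds a throwaway archive in a temporary directory so it can be run anywhere. The shop.customers.csv file and its contents are invented for the example; only the folder and archive names come from the steps above.

```shell
#!/usr/bin/env bash
# Work in a temporary directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for a real backup folder (the CSV content is made up).
mkdir cassandrabkp
printf 'id,name\n1,alice\n' > cassandrabkp/shop.customers.csv

# Compress the folder, as in the backup steps.
tar -czf cassandrabkp.tar.gz cassandrabkp/

# List the archive contents without extracting anything.
listing=$(tar -tzf cassandrabkp.tar.gz)
printf '%s\n' "$listing"
```

Against a real backup, running tar -tzf cassandrabkp.tar.gz on its own is enough to confirm that every expected CSV file made it into the archive.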