I am getting errors when connecting to Cassandra from PySpark, apparently because my Cassandra version is too old:
[idf@node1 python]$ nodetool -h localhost version
ReleaseVersion: 2.0.17
[idf@node1 python]$
[idf@node1 cassandra]$ java --version
Unrecognized option: --version
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
[idf@node1 cassandra]$ java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
[idf@node1 cassandra]$
I want to upgrade to the latest version. However, I have already collected quite a bit of data and I don't want to lose it. I am running a single Cassandra node on CentOS 7.2. My questions are:
- Where is the Cassandra data stored on the local file system?
- Is it correct to assume that I can compress this directory and move it?
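To make the second question concrete, this is roughly the backup I have in mind. It is only a sketch under assumptions: `/var/lib/cassandra/data` is my guess at the package default (the real location should be whatever `data_file_directories` in `cassandra.yaml` says), `backup_directory` and the archive path are names I made up, and I am assuming the node has been stopped (for example after `nodetool drain`) so the files are not changing while they are archived.

```python
import tarfile
from pathlib import Path


def backup_directory(data_dir, archive_path):
    """Pack data_dir into a gzip-compressed tar archive and return the archive path."""
    data_dir = Path(data_dir)
    archive_path = Path(archive_path)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"not a directory: {data_dir}")
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores entries relative to the directory name,
        # so the archive does not embed the absolute path.
        tar.add(data_dir, arcname=data_dir.name)
    return archive_path


# Hypothetical usage; both paths are assumptions, not verified on my machine:
# backup_directory("/var/lib/cassandra/data", "/tmp/cassandra-data-backup.tar.gz")
```

I have not checked whether an archive made this way can be read by a newer Cassandra version after restoring it, which is part of what I am asking.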
Then, once the data is backed up, what is the correct way to upgrade Cassandra? Is it:
- remove old version completely
- install new version
- copy data back
What is the best practice for doing this?