Cassandra 1.2.x to 2.x data center rebuild

2019-05-30 08:36发布

问题:

I'm trying to upgrade from Cassandra 1.2.x to 2.x. The way I normally do upgrades is by bringing up a new data center (this is on EC2, so not much of an issue) and using nodetool rebuild to move the data over to the new data center. Then switch apps over to the new data center, repair, and then shut down the old data center.

However I am having some trouble with this going from 1.2.15.1 to 2.0.7.31. On the 2.x nodes when I run nodetool rebuild us-east-1-2-15-1, instead of starting the rebuild as expected I get the following error.

Exception in thread "main" java.lang.RuntimeException: Error while rebuilding node: Stream failed
        at org.apache.cassandra.service.StorageService.rebuild(StorageService.java:962)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
        at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
        at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
        at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
        at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
        at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
        at sun.rmi.transport.Transport$1.run(Transport.java:177)
        at sun.rmi.transport.Transport$1.run(Transport.java:174)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

This could possibly be due to something about an incompatibility between 1.2.x and 2.x or just a bug in the version of 2.x I'm trying to use, but I can find no existing instances of anyone else seeing this issue. Any thoughts?

ETA: I also tried adding new 2.x nodes to the existing cluster, planning to then slowly removing the 1.2.x nodes as I replace them, obviously being sure not to use any 2.x features before all 1.2.x nodes are decommissioned. However, this didn't work and I got exactly the same error.

回答1:

The issue, it turns out, is that streaming data between different major versions of Cassandra is not supported, so I must upgrade the nodes in place by following the upgrade procedure described in the documentation rather than trying to upgrade by streaming data to new nodes. Once the upgrade is complete I can, of course, stream the data onto new nodes if I need to, but the major versions need to match for streaming between nodes to work properly.

Thanks to Rob Coli setting me straight on this on IRC.