Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-18792

Test failure: transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_between_and_cleanup

    XMLWordPrintableJSON

Details

    Description

      The Python dtest Test failure: transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_between_and_cleanup seems to be flaky at least in trunk:

      ccmlib.node.ToolError: Subprocess ['nodetool', '-h', 'localhost', '-p', '7200', 'cleanup'] exited with non-zero status; exit status: 2; 
      stderr: error: Node is involved in cluster membership changes. Not safe to run cleanup.
      -- StackTrace --
      java.lang.RuntimeException: Node is involved in cluster membership changes. Not safe to run cleanup.
      	at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:4037)
      	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
      	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
      	at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:72)
      	at jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
      	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
      	at java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:262)
      	at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
      	at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
      	at java.management/com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
      	at java.management/com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
      	at java.management/com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
      	at java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:814)
      	at java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:802)
      	at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1472)
      	at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1310)
      	at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1405)
      	at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
      	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
      	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
      	at java.rmi/sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:360)
      	at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:200)
      	at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:197)
      	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
      	at java.rmi/sun.rmi.transport.Transport.serviceCall(Transport.java:196)
      	at java.rmi/sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:587)
      	at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:828)
      	at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:705)
      	at java.base/java.security.AccessController.doPrivileged(AccessController.java:399)
      	at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:704)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
      	at java.base/java.lang.Thread.run(Thread.java:833)
      self = <transient_replication_ring_test.TestTransientReplicationRing object at 0x7f6d38295710>
      
          @flaky(max_runs=1)
          @pytest.mark.no_vnodes
          def test_move_forwards_between_and_cleanup(self):
              """Test moving a node forwards past a neighbor token"""
              move_token = '00025'
              expected_after_move = [gen_expected(range(0, 26), range(31, 40, 2)),
                                     gen_expected(range(0, 21, 2), range(31, 40)),
                                     gen_expected(range(1, 11, 2), range(11, 21, 2), range(21, 31)),
                                     gen_expected(range(21, 26, 2), range(26, 40))]
              expected_after_repair = [gen_expected(range(0, 26)),
                                       gen_expected(range(0, 21), range(31, 40)),
                                       gen_expected(range(21, 31),),
                                       gen_expected(range(26, 40))]
      >       self.move_test(move_token, expected_after_move, expected_after_repair)
      
      transient_replication_ring_test.py:291: 
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      transient_replication_ring_test.py:268: in move_test
          cleanup_nodes(nodes)
      transient_replication_ring_test.py:43: in cleanup_nodes
          node.nodetool('cleanup')
      ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:1018: in nodetool
          return handle_external_tool_process(p, ['nodetool', '-h', 'localhost', '-p', str(self.jmx_port)] + shlex.split(cmd))
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      
      process = <subprocess.Popen object at 0x7f6d376055c0>
      cmd_args = ['nodetool', '-h', 'localhost', '-p', '7200', 'cleanup']
      
          def handle_external_tool_process(process, cmd_args):
              out, err = process.communicate()
              if (out is not None) and isinstance(out, bytes):
                  out = out.decode()
              if (err is not None) and isinstance(err, bytes):
                  err = err.decode()
              rc = process.returncode
          
              if rc != 0:
      >           raise ToolError(cmd_args, rc, out, err)
      E           ccmlib.node.ToolError: Subprocess ['nodetool', '-h', 'localhost', '-p', '7200', 'cleanup'] exited with non-zero status; exit status: 2; 
      E           stderr: error: Node is involved in cluster membership changes. Not safe to run cleanup.
      E           -- StackTrace --
      E           java.lang.RuntimeException: Node is involved in cluster membership changes. Not safe to run cleanup.
      E           	at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:4037)
      E           	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      E           	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
      E           	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      E           	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
      E           	at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:72)
      E           	at jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
      E           	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      E           	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
      E           	at java.base/sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:262)
      E           	at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
      E           	at java.management/com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
      E           	at java.management/com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
      E           	at java.management/com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
      E           	at java.management/com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
      E           	at java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:814)
      E           	at java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:802)
      E           	at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1472)
      E           	at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1310)
      E           	at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1405)
      E           	at java.management.rmi/javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
      E           	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      E           	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
      E           	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      E           	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
      E           	at java.rmi/sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:360)
      E           	at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:200)
      E           	at java.rmi/sun.rmi.transport.Transport$1.run(Transport.java:197)
      E           	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
      E           	at java.rmi/sun.rmi.transport.Transport.serviceCall(Transport.java:196)
      E           	at java.rmi/sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:587)
      E           	at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:828)
      E           	at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:705)
      E           	at java.base/java.security.AccessController.doPrivileged(AccessController.java:399)
      E           	at java.rmi/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:704)
      E           	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
      E           	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
      E           	at java.base/java.lang.Thread.run(Thread.java:833)
      
      ../env3.6/lib/python3.6/site-packages/ccmlib/node.py:2318: ToolError
      

      This hasn't been seen yet on Butler but on ad-hoc PRs. It can also be reproduced with the multiplexer:

      .circleci/generate.sh -p \
        -e REPEATED_DTESTS_COUNT=500 \
        -e REPEATED_DTESTS=transient_replication_ring_test.py::TestTransientReplicationRing::test_move_forwards_between_and_cleanup
      

      Attachments

        Issue Links

          Activity

            People

              brandon.williams Brandon Williams
              adelapena Andres de la Peña
              Brandon Williams
              Berenguer Blasi, Brandon Williams
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: