Cassandra / CASSANDRA-10991

Cleanup of OpsCenter keyspace fails - node thinks it hasn't joined the ring yet


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 2.2.6, 3.0.4, 3.4
    • Component/s: Legacy/Observability
    • Labels: None
    • Environment: C* 2.1.12, Debian Wheezy
    • Severity: Normal

    Description

      I have a C* cluster spread across 3 DCs. Running cleanup on all nodes in one DC always fails:

      root@db1:~# nt cleanup system
      root@db1:~# nt cleanup sync
      root@db1:~# nt cleanup OpsCenter
      Aborted cleaning up atleast one column family in keyspace OpsCenter, check server logs for more information.
      error: nodetool failed, check server logs
      -- StackTrace --
      java.lang.RuntimeException: nodetool failed, check server logs
              at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:292)
              at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:204)
      
      root@db1:~# 
      

      I checked the two other DCs; running cleanup there works fine (it does not fail immediately).

      Output of nodetool status from one node in the problematic DC:

      root@db1:~# nt status
      Datacenter: Amsterdam
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address        Load       Tokens  Owns    Host ID                               Rack
      UN  10.210.3.162   518.54 GB  256     ?       50e606f5-e893-4a3b-86d3-1e5986dceea9  RAC1
      UN  10.210.3.230   532.63 GB  256     ?       7b8fc988-8a6a-4d94-ae84-ab9da9ab01e8  RAC1
      UN  10.210.3.161   538.82 GB  256     ?       d44b0f6d-7933-4a7c-ba7b-f8648e038f85  RAC1
      UN  10.210.3.160   497.6 GB   256     ?       e7332179-a47e-471d-bcd4-08c638ab9ea4  RAC1
      UN  10.210.3.224   334.25 GB  256     ?       92b0bd8c-0a5a-446a-83ea-2feea4988fe3  RAC1
      UN  10.210.3.118   518.34 GB  256     ?       ebddeaf3-1433-4372-a4ca-9c7ba3d4a26b  RAC1
      UN  10.210.3.221   516.57 GB  256     ?       44d67a49-5310-4ab5-b448-a44be350abf5  RAC1
      UN  10.210.3.117   493.83 GB  256     ?       aae92956-82d6-421e-8f3f-22393ac7e5f7  RAC1
      Datacenter: Analytics
      =====================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address        Load       Tokens  Owns    Host ID                               Rack
      UN  10.210.59.124  392.83 GB  320     ?       f770a8cc-b7bf-44ac-8cc0-214d9228dfcd  RAC1
      UN  10.210.59.151  411.9 GB   320     ?       3cc87422-0e43-4cd1-91bf-484f121be072  RAC1
      UN  10.210.58.132  309.8 GB   256     ?       84d94d13-28d3-4b49-a3d9-557ab47e79b9  RAC1
      UN  10.210.58.133  281.82 GB  256     ?       02bd2d02-41c5-4193-81b0-dee434adb0da  RAC1
      UN  10.210.59.86   285.84 GB  256     ?       bc6422ea-22e9-431a-ac16-c4c040f0c4e5  RAC1
      UN  10.210.59.84   331.06 GB  256     ?       a798e6b0-3a84-4ec2-82bb-8474086cb315  RAC1
      UN  10.210.59.85   366.26 GB  256     ?       52699077-56cf-4c1e-b308-bf79a1644b7e  RAC1
      Datacenter: Ashburn
      ===================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address        Load       Tokens  Owns    Host ID                               Rack
      UN  10.195.15.176  534.51 GB  256     ?       c6ac22df-c43a-4b25-b3b5-5e12ce9c69da  RAC1
      UN  10.195.15.177  313.73 GB  256     ?       eafa2a72-84a2-4cdc-a634-3c660acc6af8  RAC1
      UN  10.195.15.163  470.92 GB  256     ?       bcd2a534-94c4-4406-8d16-c1fc26b41844  RAC1
      UN  10.195.15.162  539.82 GB  256     ?       bb649cef-21de-4077-a35f-994319011a06  RAC1
      UN  10.195.15.182  499.64 GB  256     ?       6ce2d14d-9fb8-4494-8e97-3add05bd35de  RAC1
      UN  10.195.15.167  508.48 GB  256     ?       6f359675-852a-4842-9ff2-bdc69e6b04a2  RAC1
      UN  10.195.15.166  490.28 GB  256     ?       1ec5d0c5-e8bd-4973-96d9-523de91d08c5  RAC1
      UN  10.195.15.183  447.78 GB  256     ?       824165b0-1f1b-40e8-9695-e2f596cb8611  RAC1
      
      Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
      

      Logs from one of the nodes where cleanup fails:

      INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:33,942 CompactionManager.java:388 - Cleanup cannot run before a node has joined the ring
      INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:33,970 CompactionManager.java:388 - Cleanup cannot run before a node has joined the ring
      INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,000 CompactionManager.java:388 - Cleanup cannot run before a node has joined the ring
      INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,027 CompactionManager.java:388 - Cleanup cannot run before a node has joined the ring
      INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,053 CompactionManager.java:388 - Cleanup cannot run before a node has joined the ring
      INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,082 CompactionManager.java:388 - Cleanup cannot run before a node has joined the ring
      INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,110 CompactionManager.java:388 - Cleanup cannot run before a node has joined the ring
      INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,137 CompactionManager.java:388 - Cleanup cannot run before a node has joined the ring
      INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,165 CompactionManager.java:388 - Cleanup cannot run before a node has joined the ring
      INFO  [RMI TCP Connection(158004)-10.210.59.86] 2016-01-09 15:58:34,195 CompactionManager.java:388 - Cleanup cannot run before a node has joined the ring
      

      The problem affects only the OpsCenter keyspace; running the same command for other keyspaces works just fine.
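      The log excerpt points at a precondition check in CompactionManager. A minimal Java sketch (not the actual Cassandra source; the class and method names below are hypothetical stand-ins) of how such a check could misfire: if the node owns no token ranges for a keyspace, for example because that keyspace's replication strategy places zero replicas in this DC, an empty-ranges check can be reported as "node has not joined the ring" even though the node is fully joined.

      ```java
      import java.util.Collection;
      import java.util.Collections;
      import java.util.List;

      // Hypothetical model, not Cassandra source: illustrates how a cleanup
      // precondition of the form "do I own any ranges for this keyspace?"
      // can misreport a fully joined node as not having joined the ring.
      public class CleanupCheckSketch {

          // Stand-in for the token ranges this node owns for a keyspace.
          // Empty when the keyspace's replication strategy puts no replicas
          // in this node's datacenter.
          static Collection<String> localRanges(boolean keyspaceReplicatedHere) {
              return keyspaceReplicatedHere ? List.of("(0,100]") : Collections.emptyList();
          }

          // Sketch of the precondition: an empty range set is treated the
          // same as "node not joined", matching the log message above.
          static String tryCleanup(boolean keyspaceReplicatedHere) {
              if (localRanges(keyspaceReplicatedHere).isEmpty())
                  return "Cleanup cannot run before a node has joined the ring";
              return "cleanup ok";
          }

          public static void main(String[] args) {
              // Keyspace replicated in this DC: cleanup proceeds.
              System.out.println(tryCleanup(true));
              // Keyspace with no replicas in this DC (like OpsCenter here):
              // the misleading "joined the ring" message fires.
              System.out.println(tryCleanup(false));
          }
      }
      ```

      If this is the mechanism, the message is misleading rather than a genuine join problem, which would explain why only the OpsCenter keyspace is affected and only in this one DC.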

            People

              Assignee: Marcus Eriksson (marcuse)
              Reporter: mlowicki
              Authors: Marcus Eriksson
              Reviewers: Carl Yeksigian
              Votes: 0
              Watchers: 3
