Cassandra / CASSANDRA-8798

don't throw TombstoneOverwhelmingException during bootstrap

Details

    • Type: Bug
    • Status: Resolved
    • Normal
    • Resolution: Won't Fix
    • Normal

    Description

      During bootstrap, honouring tombstone_failure_threshold seems counter-productive: the node is not yet serving requests, so the threshold is not protecting anything.

      Instead, bootstrap fails, and a cluster that obviously needs an extra node doesn't get it.

      *History*
      When adding a new node, the bootstrap process looked complete: streaming had finished, compactions had finished, and all disk and CPU activity was calm.
      But the node was still stuck in "joining" status.

      The last stage in the bootstrapping process is the rebuilding of secondary indexes. Grepping the logs confirmed that bootstrap failed during this stage:

      grep SecondaryIndexManager cassandra/logs/*

      To see which secondary index rebuilds were initiated:

      grep "index build of " cassandra/logs/* | awk -F" for data in " '{print $1}'
      
      INFO 13:18:11,252 Submitting index build of addresses.unobfuscatedIndex
      INFO 13:18:11,352 Submitting index build of Inbox.FINNBOXID_INDEX
      INFO 23:03:54,758 Submitting index build of [events.collected_tbIndex, events.real_tbIndex]
      

      To see which secondary index rebuilds completed successfully:

      grep "Index build of " cassandra/logs/*
      
      INFO 13:18:11,263 Index build of addresses.unobfuscatedIndex complete
      INFO 13:18:11,355 Index build of Inbox.FINNBOXID_INDEX complete
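
      The two greps above can be combined to list the index builds that were submitted but never logged as complete. The following is only a sketch: the function name pending_index_builds is hypothetical, and it assumes the log lines have the shape shown above ("Submitting index build of X for data in ..." and "Index build of X complete").

```shell
# Hypothetical helper: print index builds that were submitted but have no
# matching "complete" line. Arguments are the log files to scan.
pending_index_builds() {
    submitted=$(mktemp)
    completed=$(mktemp)

    # Names of all submitted builds (strip the prefix and the " for data in ..." suffix).
    grep -h "Submitting index build of " "$@" \
        | sed -e 's/.*Submitting index build of //' -e 's/ for data in .*//' \
        | LC_ALL=C sort -u > "$submitted"

    # Names of all completed builds. The capital "I" keeps "Submitting index build of" out.
    grep -h "Index build of " "$@" \
        | grep " complete" \
        | sed -e 's/.*Index build of //' -e 's/ complete.*//' \
        | LC_ALL=C sort -u > "$completed"

    # Lines present only in the submitted list.
    LC_ALL=C comm -23 "$submitted" "$completed"

    rm -f "$submitted" "$completed"
}
```

      Run as: pending_index_builds cassandra/logs/*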
      

      Looking closer at [events.collected_tbIndex, events.real_tbIndex] showed the following stack trace:

      ERROR [StreamReceiveTask:121] 2015-02-12 05:54:47,768 CassandraDaemon.java (line 199) Exception in thread Thread[StreamReceiveTask:121,5,main]
      java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
              at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:413)
              at org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:142)
              at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:130)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
              at java.util.concurrent.FutureTask.run(FutureTask.java:262)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
              at java.util.concurrent.FutureTask.report(FutureTask.java:122)
              at java.util.concurrent.FutureTask.get(FutureTask.java:188)
              at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:409)
              ... 7 more
      Caused by: java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
              at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:160)
              at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:143)
              at org.apache.cassandra.db.Keyspace.indexRow(Keyspace.java:406)
              at org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62)
              at org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:834)
              ... 5 more
      Caused by: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
              at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:202)
              at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
              at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
              at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
              at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
              at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
              at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
              at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
              at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:333)
              at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
              at org.apache.cassandra.service.pager.SliceQueryPager.queryNextPage(SliceQueryPager.java:85)
              at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:88)
              at org.apache.cassandra.service.pager.SliceQueryPager.fetchPage(SliceQueryPager.java:35)
              at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:154)
              ... 9 more
      

      To get past this, I had to raise org.apache.cassandra.db:type=StorageService.TombstoneFailureThreshold, manually rebuild the index, and then restart the node with auto_bootstrap=false.
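
      For the final step of that workaround, a minimal sketch of setting auto_bootstrap to false in cassandra.yaml before the restart. The function name ensure_auto_bootstrap_false is hypothetical and the path to cassandra.yaml depends on the installation. The index rebuild itself is not shown: nodetool has a rebuild_index command, but whether it was used here, and its exact arguments for this Cassandra version, are not stated in this report.

```shell
# Hypothetical helper: make sure the given cassandra.yaml sets auto_bootstrap to false.
# Argument: path to cassandra.yaml (installation-specific).
ensure_auto_bootstrap_false() {
    yaml="$1"
    if grep -q '^auto_bootstrap:' "$yaml"; then
        # The setting is already present: rewrite its value in place via a temp file
        # (avoids relying on the non-portable "sed -i").
        tmp=$(mktemp)
        sed 's/^auto_bootstrap:.*/auto_bootstrap: false/' "$yaml" > "$tmp" && cat "$tmp" > "$yaml"
        rm -f "$tmp"
    else
        # The setting is absent (Cassandra then defaults to bootstrapping): append it.
        printf '\nauto_bootstrap: false\n' >> "$yaml"
    fi
}
```

      Run as: ensure_auto_bootstrap_false /path/to/cassandra.yaml, then restart the node.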

          People

            Assignee: Unassigned
            Reporter: Michael Semb Wever (mck)