HBase / HBASE-23601

OutputSink.WriterThread exception gets stuck and is repeated indefinitely


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.2
    • Fix Version/s: 3.0.0-alpha-1, 2.3.0
    • Component/s: read replicas
    • Labels: None

    Description

      When a WriterThread runs into an exception (e.g. NotServingRegionException), the exception is stored in the controller. It is never removed and cannot be overwritten either.

      public void run() {
        try {
          doRun();
        } catch (Throwable t) {
          LOG.error("Exiting thread", t);
          // Stores t in the controller; nothing clears it afterwards.
          controller.writerThreadError(t);
        }
      }

      As a result, every time PipelineController.checkForErrors() is called, the same stale exception is rethrown.
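      The behavior described above can be modeled with a small, self-contained sketch. This is a simplified illustration, not the actual HBase PipelineController: the class name StickyErrorController and the wrapping in IOException are assumptions made for the example. It only shows how a "store the first error, never clear it" holder keeps rethrowing the same exception.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified stand-in for the controller described above;
// this is not the real HBase PipelineController.
class StickyErrorController {
  private final AtomicReference<Throwable> thrown = new AtomicReference<>();

  // Records only the first error; later calls do not overwrite it.
  void writerThreadError(Throwable t) {
    thrown.compareAndSet(null, t);
  }

  // Rethrows the stored error on every call, because nothing ever clears it.
  void checkForErrors() throws IOException {
    Throwable t = thrown.get();
    if (t != null) {
      throw new IOException("writer thread failed", t);
    }
  }
}
```

      With such a holder, a caller that catches the exception and simply loops again will see the same failure on every iteration, which matches the log flooding reported in this issue.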

       

      For example, RegionReplicaReplicationEndpoint.replicate contains a while loop that does the actual replicating. On every iteration it calls checkForErrors(), catches the rethrown exception and logs it, but does nothing to resolve it. In my experience this produces roughly 2 GB of log files in about 5 minutes.

      My proposal is to clear the stored exception once it reaches RegionReplicaReplicationEndpoint.replicate and to make sure we restart the WriterThread that died throwing it.
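      One possible shape of this proposal, as a minimal sketch under stated assumptions: the names RecoveringController, pollError and handlePendingError are hypothetical and do not exist in HBase, and the restart action is passed in as a plain Runnable. The idea is only that the stored error is consumed atomically, so it is reported once, and that the dead writer is restarted at that point.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the proposal: consume the stored error once and
// restart the failed writer. All names are illustrative, not HBase APIs.
class RecoveringController {
  private final AtomicReference<Throwable> thrown = new AtomicReference<>();
  private final AtomicInteger restarts = new AtomicInteger();

  // Records the first error reported since the last time it was cleared.
  void writerThreadError(Throwable t) {
    thrown.compareAndSet(null, t);
  }

  // Returns the stored error and clears it atomically, so it is seen once.
  Throwable pollError() {
    return thrown.getAndSet(null);
  }

  // One iteration of a replicate-style loop: if an error is pending,
  // clear it and run the supplied restart action. Returns true if handled.
  boolean handlePendingError(Runnable restartWriter) {
    Throwable t = pollError();
    if (t == null) {
      return false;
    }
    restartWriter.run();
    restarts.incrementAndGet();
    return true;
  }

  int restartCount() {
    return restarts.get();
  }
}
```

      Because the error is cleared when it is handled, a second pass through the loop finds nothing pending and does not log the same exception again. A new failure from the restarted writer can then be stored and reported normally.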


          People

            Assignee: Szabolcs Bukros (bszabolcs)
            Reporter: Szabolcs Bukros (bszabolcs)