HBASE-19647

Logging cleanups; emit regionname when RegionTooBusyException inside RetriesExhausted... make netty connect/disconnect TRACE-level

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0-beta-1, 2.0.0
    • Component/s: None
    • Labels: None

      Description

      In MapReduce (MR) job failures, I see this:

      Error: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 365 actions: RegionTooBusyException: 365 times, servers with issues: ve0534.halxg.cloudera.com,16020,1514392912363
          at org.apache.hadoop.hbase.client.BatchErrors.makeException(BatchErrors.java:54)
          at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:491)
          at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:268)
          at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:225)
          at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.persist(IntegrationTestBigLinkedList.java:541)
          at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:464)
          at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:399)
          at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
          at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
          at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
          at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:422)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
          at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
      [2017-12-27 09:09:11.570]Container killed by the ApplicationMaster.
      [2017-12-27 09:09:11.586]Container killed on request. Exit code is 143
      [2017-12-27 09:09:11.600]Container exited with a non-zero exit code 143.

      It's missing the region name, which is in the root exception, RegionTooBusyException; we just skip it.
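
To illustrate the gap, here is a minimal, self-contained sketch. It is not the attached patch and does not use the HBase client classes: `RegionTooBusyException` below is a hypothetical stand-in, and `BatchErrorSummary`, `summarizeByClass`, `summarizeWithRegions`, and the region names are all made up for illustration. It only assumes that each root exception carries the region name in its message. Counting failures by exception class alone loses that message; carrying the distinct messages into the summary keeps it.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BatchErrorSummary {
    /** Hypothetical stand-in for the real exception; its message carries the region name. */
    static class RegionTooBusyException extends IOException {
        RegionTooBusyException(String message) {
            super(message);
        }
    }

    /** Counts failures by exception class only, so each cause's message is dropped. */
    static String summarizeByClass(List<Throwable> causes) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Throwable t : causes) {
            counts.merge(t.getClass().getSimpleName(), 1, Integer::sum);
        }
        StringBuilder sb = new StringBuilder("Failed " + causes.size() + " actions: ");
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append(" times, ");
        }
        return sb.toString();
    }

    /** Same summary, plus the distinct messages of the busy-region causes. */
    static String summarizeWithRegions(List<Throwable> causes) {
        Set<String> details = new LinkedHashSet<>();
        for (Throwable t : causes) {
            if (t instanceof RegionTooBusyException && t.getMessage() != null) {
                details.add(t.getMessage());
            }
        }
        return summarizeByClass(causes) + "busy regions: " + details;
    }

    public static void main(String[] args) {
        List<Throwable> causes = new ArrayList<>();
        // Region names here are invented placeholders, not real output.
        causes.add(new RegionTooBusyException("region=exampleTable,row0001"));
        causes.add(new RegionTooBusyException("region=exampleTable,row0001"));
        causes.add(new RegionTooBusyException("region=exampleTable,row0500"));
        System.out.println(summarizeByClass(causes));
        System.out.println(summarizeWithRegions(causes));
    }
}
```

Deduplicating the messages keeps the summary short when hundreds of actions fail against the same region, as in the 365-action failure above.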

        Attachments

        1. addendum.patch (7 kB, Michael Stack)
        2. 19647.patch (19 kB, Michael Stack)


              People

              • Assignee: Michael Stack (stack)
              • Reporter: Michael Stack (stack)
              • Votes: 0
              • Watchers: 3
