HBASE-23238

Additional test and checks for null references on ScannerCallableWithReplicas


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.2.12
    • Fix Version/s: 3.0.0-alpha-1, 2.3.0, 1.6.0, 2.2.3, 2.1.9
    • Component/s: None
    • Labels: None

    Description

      One of our customers running a 1.2-based version is facing an NPE when scanning data from an MR job. It happens when the map task is finalising:

      ...
      2019-09-10 14:17:22,238 INFO [main] org.apache.hadoop.mapred.MapTask: Ignoring exception during close for org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader@3a5b7d7e
      java.lang.NullPointerException
              at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.setClose(ScannerCallableWithReplicas.java:99)
              at org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:730)
              at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.close(TableRecordReaderImpl.java:178)
              at org.apache.hadoop.hbase.mapreduce.TableRecordReader.close(TableRecordReader.java:89)
              at org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase$1.close(MultiTableInputFormatBase.java:112)
              at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:529)
              at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:2039)
      ...
      2019-09-10 14:18:24,601 FATAL [IPC Server handler 5 on 35745] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1566832111959_6047_m_000000_3 - exited : java.lang.NullPointerException
              at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.setClose(ScannerCallableWithReplicas.java:99)
              at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:264)
              at org.apache.hadoop.hbase.client.ClientScanner.possiblyNextScanner(ClientScanner.java:248)
              at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:542)
              at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:371)
              at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:222)
              at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:147)
              at org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase$1.nextKeyValue(MultiTableInputFormatBase.java:139)
      ...
      
      

      After some investigation, we found out that 1.2-based deployments will consistently hit this problem under the following conditions:
      1) The total size of the KVs for a given row to be returned by the scan is larger than the max result size;
      2) At the same time, the scan filter has been exhausted, so all remaining KVs should be filtered out and not returned.

      We could simulate this with the UT proposed in this PR. When checking newer branches, though, I verified this specific problem is not present there; I believe it was indirectly fixed by the changes from HBASE-17489.
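
      To make the two conditions above concrete, here is a rough, hypothetical sketch of a scan setup matching them. This is not the UT from the patch; the table name, the size values and the use of PageFilter as the "exhausting" filter are illustrative assumptions only.

        import java.io.IOException;
        import org.apache.hadoop.hbase.TableName;
        import org.apache.hadoop.hbase.client.Connection;
        import org.apache.hadoop.hbase.client.Result;
        import org.apache.hadoop.hbase.client.ResultScanner;
        import org.apache.hadoop.hbase.client.Scan;
        import org.apache.hadoop.hbase.client.Table;
        import org.apache.hadoop.hbase.filter.PageFilter;

        public class ScanRepro {
          // Drains a scan whose max result size is smaller than one row's total KV size,
          // while the filter exhausts partway through, mirroring conditions 1) and 2) above.
          static void scanWithExhaustedFilter(Connection connection) throws IOException {
            Scan scan = new Scan();
            scan.setMaxResultSize(1024);       // assumed to be smaller than the total KV size of one row
            scan.setFilter(new PageFilter(1)); // exhausts (filterAllRemaining) after the first row
            try (Table table = connection.getTable(TableName.valueOf("test_table"));
                 ResultScanner scanner = table.getScanner(scan)) {
              for (Result result : scanner) {
                // Iterating drives ClientScanner.loadCache(), the path in the second stack trace.
              }
            }
          }
        }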

      Nevertheless, I think it would still be useful to have this extra test and these checks added as a safeguard measure.
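
      For illustration, the kind of null guard intended is sketched below. This is only a sketch, not the attached patch; it assumes the delegate field is named currentScannerCallable, as in the ScannerCallableWithReplicas code referenced in the stack traces above.

        // Sketch only, not the actual HBASE-23238 change.
        public void setClose() {
          // Guard against a null delegate: closing a scanner that never opened (or was
          // already cleaned up) becomes a no-op instead of the NPE shown above.
          if (currentScannerCallable != null) {
            currentScannerCallable.setClose();
          }
        }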
       

      Attachments

        1. HBASE-23238.branch-2.patch
          6 kB
          Wellington Chevreuil

        Activity

          People

            Assignee:
            wchevreuil Wellington Chevreuil
            Reporter:
            wchevreuil Wellington Chevreuil
            Votes:
            0
            Watchers:
            5

            Dates

              Created:
              Updated:
              Resolved: