Hadoop HDFS / HDFS-17298

Fix NPE in DataNode.handleBadBlock and BlockSender

Details

    • Reviewed

    Description

      We are seeing NullPointerExceptions (NPEs) on DataNodes in our production environment.

      The detailed exception information is:

      2023-12-20 13:58:25,449 ERROR datanode.DataNode (DataXceiver.java:run(330)) [DataXceiver for client DFSClient_NONMAPREDUCE_xxx at /xxx:41452 [Sending block BP-xxx:blk_xxx]] - xxx:50010:DataXceiver error processing READ_BLOCK operation  src: /xxx:41452 dst: /xxx:50010
      java.lang.NullPointerException
              at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:301)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:607)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:152)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:104)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:298)
              at java.lang.Thread.run(Thread.java:748)
      

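      NPE code logic (this snippet is from DataNode.handleBadBlock, the method that fails in the COPY_BLOCK trace):

      if (!fromScanner && blockScanner.isEnabled()) {
        // NPE: data.getVolume(block) returns null here
        blockScanner.markSuspectBlock(data.getVolume(block).getStorageID(),
            block);
      }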
      
      2023-12-20 13:52:18,844 ERROR datanode.DataNode (DataXceiver.java:run(330)) [DataXceiver for client /xxx:61052 [Copying block BP-xxx:blk_xxx]] - xxx:50010:DataXceiver error processing COPY_BLOCK operation  src: /xxx:61052 dst: /xxx:50010
      java.lang.NullPointerException
              at org.apache.hadoop.hdfs.server.datanode.DataNode.handleBadBlock(DataNode.java:4045)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.copyBlock(DataXceiver.java:1163)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opCopyBlock(Receiver.java:291)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:113)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:298)
              at java.lang.Thread.run(Thread.java:748)
      

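      NPE code logic (this snippet is from the BlockSender constructor, the frame that fails in the READ_BLOCK trace):

      // Obtain a reference before reading data
      // NPE: datanode.data.getVolume(block) returns null here
      volumeRef = datanode.data.getVolume(block).obtainReference();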
      

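      Both call sites dereference the result of getVolume(block) without a null check. We need to fix this so that a missing volume is handled explicitly.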
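
      One possible shape for the null handling, as a minimal sketch rather than the actual patch. Volume, Dataset, getVolumeOrThrow and suspectStorageId are hypothetical stand-ins introduced only for illustration; they are not Hadoop classes or methods. The read path is assumed to want a descriptive IOException in place of the NPE, and the bad-block path is assumed to skip marking when no volume is found.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Sketch only: every type here is a hypothetical stand-in, not a Hadoop class. */
class VolumeGuardSketch {

  /** Hypothetical stand-in for a volume; only the storage ID matters here. */
  interface Volume {
    String getStorageID();
  }

  /** Hypothetical stand-in for the dataset; getVolume may return null. */
  static class Dataset {
    private final Map<String, Volume> volumes = new HashMap<>();

    void put(String blockId, String storageId) {
      volumes.put(blockId, () -> storageId);
    }

    Volume getVolume(String blockId) {
      return volumes.get(blockId); // null when no volume holds the block
    }
  }

  /** Read-path guard: report a missing volume as an IOException, not an NPE. */
  static Volume getVolumeOrThrow(Dataset data, String blockId) throws IOException {
    Volume volume = data.getVolume(blockId);
    if (volume == null) {
      throw new IOException("No volume found for block " + blockId);
    }
    return volume;
  }

  /** Bad-block-path guard: empty result means there is nothing to mark as suspect. */
  static Optional<String> suspectStorageId(Dataset data, String blockId) {
    return Optional.ofNullable(data.getVolume(blockId)).map(Volume::getStorageID);
  }

  public static void main(String[] args) {
    Dataset data = new Dataset();
    data.put("blk_1", "DS-1");

    // Known block: a storage ID is available for marking.
    System.out.println(suspectStorageId(data, "blk_1").isPresent());
    // Unknown block: the caller can skip marking instead of failing.
    System.out.println(suspectStorageId(data, "blk_2").isPresent());

    try {
      getVolumeOrThrow(data, "blk_2");
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

      Whether the real call sites should throw, log and continue, or fall back to reporting the bad block is a design decision for the patch; the sketch only shows where the null check would sit.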

            People

              Haiyang Hu