Hadoop YARN / YARN-11449

Fix Intermittent NPE while getting node labels for queue

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: yarn

    Description

      An NPE is intermittently thrown on the ResourceManager and returned to the YARN client as a RemoteException when the client checks queue status.

      Partial stack trace:

      Caused by: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
          at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
          at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.getNodeLabelsForQueue(AbstractCSQueue.java:961)
          at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.getQueueConfigurations(AbstractCSQueue.java:528)
          at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.getQueueInfo(AbstractCSQueue.java:494)
          at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.getQueueInfo(LeafQueue.java:472)
          at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:1048)
          at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.getResourceUsageReport(FiCaSchedulerApp.java:1041)
          at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:408)
          at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:142)
          at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.getApplicationResourceUsageReport(RMAppAttemptImpl.java:954)
          at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.createAndGetApplicationReport(RMAppImpl.java:761)
          at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:396)
          at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:224)
          at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:529)
          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:507)

      The issue originates at

      https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java#L599

      because this.getAccessibleNodeLabels() can return null and the result is passed to addAll() without a null check.
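
      The failure mode can be reproduced outside YARN. The following is a minimal sketch, not the actual AbstractCSQueue code: the class NodeLabelsSketch, the methods unsafeLabels and safeLabels, and the NO_LABEL constant are hypothetical stand-ins. It only illustrates that HashSet.addAll(null) throws the NPE seen at the top of the stack trace, and that a null check in front of the call avoids it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the queue logic; not the real AbstractCSQueue.
class NodeLabelsSketch {
    // Assumed placeholder for the default (no-label) partition.
    static final String NO_LABEL = "";

    // Mirrors the failure: a null set of accessible labels goes straight to addAll().
    static Set<String> unsafeLabels(Set<String> accessibleNodeLabels) {
        Set<String> nodeLabels = new HashSet<>();
        // HashSet inherits addAll from AbstractCollection, which iterates
        // over its argument and therefore throws NPE when it is null.
        nodeLabels.addAll(accessibleNodeLabels);
        nodeLabels.add(NO_LABEL);
        return nodeLabels;
    }

    // Same logic with the null check in place.
    static Set<String> safeLabels(Set<String> accessibleNodeLabels) {
        Set<String> nodeLabels = new HashSet<>();
        if (accessibleNodeLabels != null) {
            nodeLabels.addAll(accessibleNodeLabels);
        }
        nodeLabels.add(NO_LABEL);
        return nodeLabels;
    }

    public static void main(String[] args) {
        // Normal case: configured labels plus the default partition.
        Set<String> configured = new HashSet<>(Arrays.asList("gpu"));
        System.out.println(safeLabels(configured).size());

        // Null case: only the default partition is returned.
        System.out.println(safeLabels(null).size());

        // The unguarded variant fails the same way as the stack trace.
        try {
            unsafeLabels(null);
        } catch (NullPointerException e) {
            System.out.println("NPE from addAll(null)");
        }
    }
}
```

      Treating a null label set as empty is one possible choice for the sketch; whether the real fix should do that or handle the null where the accessible labels are first set is left to the patch.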

            People

              Assignee: Sunil Kalva (skalva)
              Reporter: Sunil Kalva (skalva)
              Votes: 0
              Watchers: 3