Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.0.0-beta1
Fix Version/s: None
Component/s: None
Description
In an HA-enabled cluster (3.0), we found that the RM fails to start with an NPE from ActiveStandbyElector. ZooKeeper was down at the time, so the client kept retrying the connection for a while:
2017-12-13 18:21:22,460 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-12-13 18:21:22,544 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService failed in state INITED; cause: java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1039)
    at org.apache.hadoop.ha.ActiveStandbyElector$3.run(ActiveStandbyElector.java:1036)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1101)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1093)
    at org.apache.hadoop.ha.ActiveStandbyElector.createWithRetries(ActiveStandbyElector.java:1036)
    at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:347)
    at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.serviceInit(ActiveStandbyElectorBasedElectorService.java:110)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:326)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1420)
2017-12-13 18:21:22,545 INFO org.apache.hadoop.ha.ActiveStandbyElector: Yielding from election
2017-12-13 18:21:22,545 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: java.lang.NullPointerException