Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.1.1
None
None
Description
Preconditions:
- A secure Hadoop 3.1.1 cluster with 3 nodes is installed
- Set the following property in the RM yarn-site.xml:
<property>
<name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
<value>true</value>
</property>
- Set the following property in the NM(s) yarn-site.xml:
<property>
<name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
<value>30</value>
</property>
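To confirm the preconditions before reproducing, the two properties can be checked with a small script. This is a minimal sketch, not part of the original report: the helper names `read_yarn_properties` and `missing_preconditions` are hypothetical, and the sample text stands in for the real yarn-site.xml on each host, which would be read from disk.

```python
import xml.etree.ElementTree as ET


def read_yarn_properties(xml_text):
    """Return a {name: value} dict parsed from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def missing_preconditions(props, expected):
    """Return the expected settings that are absent or set to a different value."""
    return {key: val for key, val in expected.items() if props.get(key) != val}


# Hypothetical stand-in for the RM yarn-site.xml content.
SAMPLE_RM_SITE = """<configuration>
  <property>
    <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
    <value>true</value>
  </property>
</configuration>"""

RM_EXPECTED = {
    "yarn.resourcemanager.opportunistic-container-allocation.enabled": "true",
}
NM_EXPECTED = {
    "yarn.nodemanager.opportunistic-containers-max-queue-length": "30",
}

if __name__ == "__main__":
    rm_props = read_yarn_properties(SAMPLE_RM_SITE)
    print("RM settings still missing:", missing_preconditions(rm_props, RM_EXPECTED))
```

The RM and NM settings are checked separately because they live in different yarn-site.xml files; `NM_EXPECTED` would be passed with the properties read on each NodeManager host.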
Test Steps:
Job Command:
yarn org.apache.hadoop.yarn.applications.distributedshell.Client jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC -promote_opportunistic_after_start
Actual Result: The Distributed Shell (DS) YARN job fails almost every time with the diagnostics message below:
[ Failed Reason : Application Failure: desired = 10, completed = 10, allocated = 10, failed = 2, diagnostics = [2021-02-10 00:00:27.640]Container Killed to make room for Guaranteed Container.]
Expected Result: The DS job should succeed when run with the "promote_opportunistic_after_start" argument.