Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
2.8.5
None
None
Description
Cluster:
1 Node with label "A"
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.resourcemanager.monitor.capacity.preemption.intra-queue-preemption.enabled=true
yarn.scheduler.capacity.root.default.minimum-user-limit-percent=50
User 1: Submit job Y, which requires 10x the cluster's resources, to queue default using label ""
User 2: (after job Y starts) Submit job Z to queue default using label ""
What we see: Job Z doesn't start until job Y releases resources. This happens because the pending requests for jobs Y and Z are in partition "", whereas queue default is using resources in partition "A". Pending requests in partition "" don't trigger intra-queue preemption in partition "A".
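
The behaviour described above can be illustrated with a small toy model. This is only a sketch of the reasoning, not the actual CapacityScheduler preemption code: the function name, the dictionary layout, and the resource numbers are all hypothetical, and the model assumes that pending demand in a partition can only reclaim resources used in that same partition.

```python
from collections import defaultdict


def preemptable_per_partition(pending, used):
    """Toy model (NOT the YARN implementation).

    pending: list of dicts with hypothetical keys "app", "partition", "amount".
    used:    dict mapping partition name -> resources the queue uses there.

    Assumption: demand is tallied per partition, and only demand in
    partition P can justify preempting usage in partition P.
    """
    demand = defaultdict(int)
    for request in pending:
        demand[request["partition"]] += request["amount"]
    # Resources that could be reclaimed are capped by both demand and usage.
    return {part: min(demand[part], amount) for part, amount in used.items()}


# Scenario from the report: the queue's usage is counted in partition "A",
# while the pending requests of jobs Y and Z are in partition "".
used = {"A": 100}
pending = [
    {"app": "Y", "partition": "", "amount": 900},
    {"app": "Z", "partition": "", "amount": 10},
]
reported_case = preemptable_per_partition(pending, used)

# For contrast: if job Z's request were in partition "A", the model would
# find something to reclaim there.
contrast_case = preemptable_per_partition(
    [{"app": "Z", "partition": "A", "amount": 10}], used
)
```

In this model the reported scenario yields nothing to reclaim in partition "A", which matches the observation that job Z waits for job Y to release resources on its own.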