Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Invalid
- Affects Version/s: 3.0.1
- Fix Version/s: None
- Component/s: None
Description
I see a bug in executor memory allocation in the standalone cluster, but I can't find which part of the Spark code causes this problem, which is why I decided to raise the issue here.
Assume you have 3 workers, each with 10 CPU cores and 10 GB of memory. Assume also that you have 2 Spark jobs running on this cluster of workers, configured as below:
-----------------
job-1:
executor-memory: 5g
executor-CPU: 4
max-cores: 8
------------------
job-2:
executor-memory: 6g
executor-CPU: 4
max-cores: 8
------------------
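For reference, these settings correspond roughly to the following spark-submit invocations (a sketch; the master URL and application jar names are placeholders, not taken from the report):

```shell
# job-1: 5g per executor, 4 cores per executor, at most 8 cores total
spark-submit \
  --master spark://master:7077 \
  --executor-memory 5g \
  --executor-cores 4 \
  --conf spark.cores.max=8 \
  job1.jar

# job-2: 6g per executor, 4 cores per executor, at most 8 cores total
spark-submit \
  --master spark://master:7077 \
  --executor-memory 6g \
  --executor-cores 4 \
  --conf spark.cores.max=8 \
  job2.jar
```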
In this situation, we expect that if we submit both jobs, the first job submitted gets 2 executors, each with 4 CPU cores and 5 GB of memory, and the second job gets only one executor, on the third worker, with 4 CPU cores and 6 GB of memory, because workers 1 and 2 do not have enough memory left to accept the second job. But surprisingly, we see that either the first or the second worker creates an executor for job-2, and that worker's memory consumption goes beyond what was allocated to it, taking 11 GB of memory from the operating system.
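The expected placement can be sketched as a greedy allocation over workers that have enough free cores *and* free memory. This is a simplified hypothetical model for illustrating the arithmetic, not Spark's actual standalone scheduler code:

```python
# Three workers, each with 10 cores and 10 GB free (hypothetical model).
workers = [{"id": i, "free_cores": 10, "free_mem_gb": 10} for i in range(3)]

def allocate(job, workers):
    """Greedily place one executor per worker that has enough free cores
    AND free memory, stopping once the job's max-cores cap is reached."""
    placed = []
    cores_used = 0
    for w in workers:
        if cores_used + job["executor_cores"] > job["max_cores"]:
            break
        if (w["free_cores"] >= job["executor_cores"]
                and w["free_mem_gb"] >= job["executor_mem_gb"]):
            w["free_cores"] -= job["executor_cores"]
            w["free_mem_gb"] -= job["executor_mem_gb"]
            cores_used += job["executor_cores"]
            placed.append(w["id"])
    return placed

job1 = {"executor_mem_gb": 5, "executor_cores": 4, "max_cores": 8}
job2 = {"executor_mem_gb": 6, "executor_cores": 4, "max_cores": 8}

placement1 = allocate(job1, workers)  # workers 0 and 1: 5 GB left on each
placement2 = allocate(job2, workers)  # only worker 2 still has 6 GB free
print(placement1)  # → [0, 1]
print(placement2)  # → [2]
```

Under this model, job-2 can only land on the third worker, which is the behavior the report says does not happen in practice.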
Is this behavior normal? I think it can cause undefined behavior in the cluster.