Spark / SPARK-31444

Pyspark memory and cores calculation doesn't account for task cpus


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.4.5
    • Fix Version/s: None
    • Component/s: PySpark

    Description

      During the changes for stage-level scheduling, I discovered a possible issue: the calculation for splitting up the pyspark memory doesn't take the spark.task.cpus setting into account.

      Discussion here: https://github.com/apache/spark/pull/28085#discussion_r407573038

      See PythonRunner.scala:

      https://github.com/apache/spark/blob/6b88d136deb99afd9363b208fd6fe5684fe8c3b8/core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala#L90
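
      A minimal sketch of the problem in plain Scala (the object, helper name, and example values are hypothetical, not the actual PythonRunner code): splitting the pyspark memory by executor cores alone gives each Python worker a smaller share than splitting it by the number of tasks that can actually run concurrently, which is executor cores divided by spark.task.cpus.

      object PySparkMemorySplit {
        // Hypothetical helper: returns (split by cores, split by concurrent tasks) in MiB.
        def perWorkerMb(pysparkMemoryMb: Long, executorCores: Int, taskCpus: Int): (Long, Long) = {
          // Per-core split: the calculation the issue describes, which ignores spark.task.cpus.
          val perCore = pysparkMemoryMb / executorCores
          // Only executorCores / taskCpus tasks (and hence Python workers) run at once,
          // so each worker could safely be given this larger share.
          val concurrentTasks = math.max(1, executorCores / taskCpus)
          val perTask = pysparkMemoryMb / concurrentTasks
          (perCore, perTask)
        }

        def main(args: Array[String]): Unit = {
          // Example values: 4000 MiB of pyspark memory, 4 executor cores, spark.task.cpus=2.
          val (perCore, perTask) = perWorkerMb(4000L, executorCores = 4, taskCpus = 2)
          println(s"split by cores: $perCore MiB, split by concurrent tasks: $perTask MiB")
          // prints: split by cores: 1000 MiB, split by concurrent tasks: 2000 MiB
        }
      }

      With spark.task.cpus=2 each of the two concurrent Python workers could use 2000 MiB, but a per-core split only hands each one 1000 MiB.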

          People

            Assignee: Unassigned
            Reporter: Thomas Graves (tgraves)
            Votes: 0
            Watchers: 3
