Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
- Affects Version/s: 2.4.5
- Fix Version/s: None
Description
While working on changes for stage-level scheduling, I discovered a possible issue: the calculation that splits PySpark memory among Python workers does not take the spark.task.cpus setting into account.
Discussion here: https://github.com/apache/spark/pull/28085#discussion_r407573038
See PythonRunner.scala:
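A rough sketch of the suspected problem, in Scala. The variable and config names below are illustrative, not the actual PythonRunner.scala code: the point is that dividing the PySpark memory pool by the number of executor cores ignores that, when spark.task.cpus > 1, fewer tasks (and hence fewer Python workers) run concurrently, so each worker could be allotted a larger share.

```scala
// Hypothetical sketch only; names and logic are simplified assumptions,
// not the real PythonRunner.scala implementation.
val executorCores = conf.getInt("spark.executor.cores", 1)
val taskCpus      = conf.getInt("spark.task.cpus", 1)
val pysparkMemMb  = conf.getInt("spark.executor.pyspark.memory", 0)

// Current behavior (roughly): the pool is split across all executor cores,
// regardless of how many tasks can actually run at once.
val memPerWorkerCurrent = pysparkMemMb / executorCores

// With spark.task.cpus taken into account: only executorCores / taskCpus
// tasks run concurrently, so each Python worker could get a larger share.
val concurrentTasks     = executorCores / taskCpus
val memPerWorkerAdjusted = pysparkMemMb / concurrentTasks
```

For example, with 8 executor cores, spark.task.cpus=2, and 8 GiB of PySpark memory, the current split gives each worker 1 GiB even though only 4 workers run at a time; accounting for task CPUs would allow 2 GiB each.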