Spark task scheduling has long considered CPU only: based on how many vcores each executor had at a given moment, tasks were scheduled as soon as enough vcores became available.
Moving to deep learning use cases, the fundamental computation and processing unit shifts from CPU to GPU/FPGA plus a CPU that moves data in and out of GPU memory.
Deep learning frameworks built on top of GPU fleets require pinning a task to a fixed number of GPUs, which Spark does not support yet. E.g. a Horovod task requires 2 GPUs, held uninterrupted until it finishes, regardless of CPU availability on the executor. On Uber's Peloton executor scheduler, the number of cores available can be more than what the user asked for, because the executor may get over-provisioned.
Without exclusive occupancy of a PCI device (/gpu1, /gpu2), such workloads may run into unexpected states.
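To make the "must have" semantics concrete, here is a minimal sketch (not Spark code; all names are hypothetical) of an allocator that grants a task either exclusive ownership of all the whole devices it asked for, or nothing at all, so two tasks can never end up sharing /gpu1:

```python
class GpuAllocator:
    """Hypothetical sketch of "must have" GPU assignment: no partial grants."""

    def __init__(self, num_gpus):
        self.free = set(range(num_gpus))   # device indices, e.g. /gpu0../gpuN
        self.owner = {}                    # device index -> task id

    def acquire(self, task_id, amount):
        """Grant `amount` whole devices exclusively, or None if not enough are free."""
        if len(self.free) < amount:
            return None                    # must-have: refuse a partial grant
        granted = {self.free.pop() for _ in range(amount)}
        for gpu in granted:
            self.owner[gpu] = task_id
        return granted

    def release(self, task_id):
        """Return every device the task held to the free pool."""
        held = [g for g, t in self.owner.items() if t == task_id]
        for gpu in held:
            del self.owner[gpu]
            self.free.add(gpu)
```

Under this model, a 2-GPU Horovod task on a 4-GPU executor leaves only 2 devices free, so a second task asking for 3 GPUs is simply not launched until the first one releases its devices, rather than being started in a degraded state.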
The existing SPIP, Accelerator Aware Task Scheduling For Spark (SPARK-24615), is compatible with this design, though its approach is a bit different: it tracks utilization of PCI devices, and a customized TaskScheduler could either fall back to a "best to have" approach or implement the "must have" approach stated above.
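For reference, the SPARK-24615 work expresses the per-task GPU requirement declaratively through configuration; the sketch below uses the config key names from that work (treat them as illustrative, since exact names depend on the Spark version), and shows that the concurrency an executor can sustain is a simple division of the two amounts:

```python
# Declarative GPU request in the style of SPARK-24615
# (key names as used by that work; shown here for illustration).
gpu_conf = {
    "spark.executor.resource.gpu.amount": "4",  # GPUs each executor owns
    "spark.task.resource.gpu.amount": "2",      # GPUs each task must hold
}

def gpu_task_slots(conf):
    """How many GPU tasks one executor can run concurrently under these amounts."""
    executor_gpus = int(conf["spark.executor.resource.gpu.amount"])
    task_gpus = int(conf["spark.task.resource.gpu.amount"])
    return executor_gpus // task_gpus
```

With 4 GPUs per executor and 2 GPUs per task, each executor runs at most 2 GPU tasks at once, independent of how many vcores happen to be free.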