Details
New Feature
Status: Open
Major
Resolution: Unresolved
3.4.1
None
None
Description
Hi,
This ticket is meant to understand the work involved in porting the k8s PVC reuse feature, which reuses the shuffle files already present on local disk, to the Spark standalone cluster manager.
We are heavy users of spot instances, and spot terminations regularly impact our long-running jobs.
The logic in `KubernetesLocalDiskShuffleExecutorComponents` itself is fairly small. However, when I tried the same approach with `LocalDiskShuffleExecutorComponents`, the experiment was not successful, which suggests there is more to recovering shuffle files than that class alone.
I'd like to understand what work this would involve. We would be more than happy to contribute.