Spark / SPARK-44526

Porting k8s PVC reuse logic to Spark standalone


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4.1
    • Fix Version/s: None
    • Component/s: Shuffle, Spark Core
    • Labels: None

    Description

      Hi,

      This ticket is meant to scope the work involved in porting the k8s PVC reuse feature, which reuses shuffle files already present on local disk, to the Spark standalone cluster manager.

      We are heavy users of spot instances, and spot terminations disrupt our long-running jobs.

      The logic in `KubernetesLocalDiskShuffleExecutorComponents` itself is fairly small. However, when I tried the same approach in `LocalDiskShuffleExecutorComponents`, the experiment did not succeed, which suggests there is more to recovering shuffle files than that class alone.
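To make the discussion concrete, here is a minimal sketch of only the first step of such a recovery: finding complete shuffle outputs left on disk by an earlier executor. It assumes the usual on-disk names `shuffle_<shuffleId>_<mapId>_<reduceId>.data` and `.index`. The function name `find_recoverable_shuffle_outputs` and the directory passed to it are hypothetical. This is not the code in `KubernetesLocalDiskShuffleExecutorComponents`, and it does nothing to make a new executor or the driver aware of the files it finds.

```python
import os
import re

# Assumed shuffle file naming: shuffle_<shuffleId>_<mapId>_<reduceId>.(data|index)
SHUFFLE_FILE = re.compile(r"^shuffle_(\d+)_(\d+)_(\d+)\.(data|index)$")


def find_recoverable_shuffle_outputs(root):
    """Hypothetical helper: walk `root` and return
    {(shuffle_id, map_id): {"data": path, "index": path}}
    for map outputs that have both a data file and an index file."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            match = SHUFFLE_FILE.match(name)
            if match is None:
                continue  # not a shuffle data/index file (e.g. an RDD block)
            shuffle_id, map_id, _reduce_id, kind = match.groups()
            key = (int(shuffle_id), int(map_id))
            found.setdefault(key, {})[kind] = os.path.join(dirpath, name)
    # A data file without its index (or vice versa) cannot be served, so drop it.
    return {k: v for k, v in found.items() if "data" in v and "index" in v}
```

Calling `find_recoverable_shuffle_outputs` on a worker's local directory would list candidate map outputs. Whether those outputs can then be reused on standalone is exactly the open question of this ticket.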

      I'd like to understand what work this would involve. We would be more than happy to contribute.

          People

            Assignee: Unassigned
            Reporter: Faiz Halde (haldefaiz)