SPARK-44209

Expose amount of shuffle data available on the node


Details

    • Type: New Feature
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: 3.4.1
    • Fix Version/s: None
    • Component/s: Shuffle

    Description

      ShuffleMetrics does not include metrics such as "totalShuffleDataBytes" and "numAppsWithShuffleData". These would be per-node metrics published by the External Shuffle Service.

      Adding these metrics would help with:
      1. Deciding whether a node can be decommissioned because no shuffle data is present on it
      2. Better live monitoring of customers' workloads, to see whether shuffle data is skewed toward a particular node
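
      To make the proposal concrete, here is a minimal sketch of how the two values could be tracked on a node and exposed as gauges. It is an illustration only, not the External Shuffle Service implementation: the class ShuffleDataTracker, its methods, and the idea of recording bytes per application ID are all assumptions made for this example. Only the two metric names come from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical helper (not part of Spark): tracks shuffle bytes held on one node, per application.
class ShuffleDataTracker {
    // appId -> bytes of shuffle data currently held on this node
    private final Map<String, Long> bytesByApp = new ConcurrentHashMap<>();

    // Assumed hook: called when shuffle output for an application lands on this node.
    void recordShuffleData(String appId, long bytes) {
        if (bytes <= 0) {
            return; // ignore empty writes so every tracked app really has data
        }
        bytesByApp.merge(appId, bytes, Long::sum);
    }

    // Assumed hook: called when an application's shuffle data is cleaned up.
    void removeApp(String appId) {
        bytesByApp.remove(appId);
    }

    // Proposed metric: total shuffle bytes present on this node.
    long totalShuffleDataBytes() {
        long total = 0L;
        for (long bytes : bytesByApp.values()) {
            total += bytes;
        }
        return total;
    }

    // Proposed metric: number of applications with shuffle data on this node.
    int numAppsWithShuffleData() {
        return bytesByApp.size();
    }

    // Use case 1 from the description: a node with no shuffle data can be decommissioned.
    boolean safeToDecommission() {
        return bytesByApp.isEmpty();
    }

    // Gauges keyed by metric name, in a shape a metrics registry could poll.
    Map<String, Supplier<Long>> gauges() {
        Map<String, Supplier<Long>> gauges = new LinkedHashMap<>();
        gauges.put("totalShuffleDataBytes", this::totalShuffleDataBytes);
        gauges.put("numAppsWithShuffleData", () -> (long) numAppsWithShuffleData());
        return gauges;
    }
}
```

      For the skew use case, a monitoring system could compare totalShuffleDataBytes across nodes. A node reporting far more than its peers would indicate skewed shuffle data.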

          People

            Unassigned
            Deependra Patel
            Abhishek Modi
            Votes: 0
            Watchers: 2
