  1. Spark
  2. SPARK-47017

Show metrics of the physical plan of RDDScanExec's internal RDD in the history server


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4.0, 3.5.0
    • Fix Version/s: None
    • Component/s: Web UI
    • Labels: None

    Description

      The RDDScanExec wraps an internal RDD (see below). In our environment, this RDD is usually produced by very large physical plans containing many physical nodes. Those nodes carry various metrics that are very useful for understanding how the execution behaved and where there is room for optimization.

       

      case class RDDScanExec(
          output: Seq[Attribute],
          rdd: RDD[InternalRow],     // <-- this field
          name: String,
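One way such a node can arise is sketched below, assuming a local SparkSession. The query, the column names, and the object name `ScanExistingRddSketch` are hypothetical and are not taken from the attached simple2.scala. An upstream DataFrame with its own physical plan is converted to an RDD and wrapped again with `createDataFrame`, so the new plan should contain a single "Scan ExistingRDD" leaf in place of the upstream nodes.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ScanExistingRddSketch {
  // Hypothetical helper: builds an upstream query, then re-wraps its RDD.
  def rebuild(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Upstream query: its physical plan (aggregation, exchange, ...) produces the RDD.
    val upstream = spark.range(0, 1000)
      .withColumn("bucket", $"id" % 10)
      .groupBy("bucket")
      .count()

    // Wrapping the RDD again hides the upstream plan behind one scan node.
    spark.createDataFrame(upstream.rdd, upstream.schema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("scan-existing-rdd-sketch")
      .getOrCreate()

    val rebuilt = rebuild(spark)
    // Print the physical plan of the re-wrapped DataFrame.
    println(rebuilt.queryExecution.executedPlan)
    rebuilt.collect()

    spark.stop()
  }
}
```

Running this with event logging enabled and opening the application in the History Server is one way to inspect how the re-wrapped query is displayed.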

       

      However, neither the physical plan nor its metrics are visible in the SQL DAG in the Spark History Server. Because the node is an "existing RDD", the physical plan may be found in some previous SQL execution, but the metrics are not visible there either. This is because the definitions of these metrics are reported with the SparkListenerSQLExecutionStart event of the previous SQL (the one containing the physical plan of RDDScanExec.rdd), whereas the metric values are reported in the SparkListenerTaskEnd events of tasks that are attached to the SQL containing the RDDScanExec.
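The mismatch described above can be illustrated with a small, Spark-free model. This is only a sketch: `MetricDef`, `ExecutionStart`, `TaskEnd`, and `visibleMetrics` are hypothetical stand-ins, not Spark's listener classes, and the rule "an execution only displays values for accumulator ids its own start event defined" is an assumption derived from the description, not a copy of the History Server's code.

```scala
// Simplified stand-ins for the listener events named above; NOT Spark's real classes.
final case class MetricDef(accumulatorId: Long, name: String)
final case class ExecutionStart(executionId: Long, metrics: Seq[MetricDef])
final case class TaskEnd(executionId: Long, accumUpdates: Map[Long, Long])

object MetricAttributionModel {
  /** Per execution, keep only values whose accumulator id that execution's start event defined. */
  def visibleMetrics(
      starts: Seq[ExecutionStart],
      taskEnds: Seq[TaskEnd]): Map[Long, Map[String, Long]] =
    starts.map { start =>
      val names = start.metrics.map(m => m.accumulatorId -> m.name).toMap
      // All accumulator updates from tasks attached to this execution.
      val updates = taskEnds.filter(_.executionId == start.executionId).flatMap(_.accumUpdates)
      val summed = updates
        .collect { case (id, value) if names.contains(id) => names(id) -> value }
        .groupBy(_._1)
        .map { case (name, values) => name -> values.map(_._2).sum }
      start.executionId -> summed
    }.toMap

  def main(args: Array[String]): Unit = {
    // Execution 1: the "previous SQL" that defines the metrics of the plan behind the RDD.
    val previous = ExecutionStart(1L, Seq(MetricDef(100L, "output rows (inner plan)")))
    // Execution 2: the SQL containing the scan node; it defines only the scan's own metric.
    val later = ExecutionStart(2L, Seq(MetricDef(200L, "output rows (scan)")))
    // Tasks are attached to execution 2 but update both accumulators.
    val tasks = Seq(
      TaskEnd(2L, Map(100L -> 40L, 200L -> 40L)),
      TaskEnd(2L, Map(100L -> 60L, 200L -> 60L)))

    visibleMetrics(Seq(previous, later), tasks).toSeq.sortBy(_._1).foreach(println)
  }
}
```

In this model the values for accumulator 100 match neither execution: execution 1 defines the metric but has no tasks attached, and execution 2 receives the updates but has no definition for them.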

       

      Could we consider showing the physical plan and metrics of RDDScanExec.rdd (the "Scan Existing RDD" node in the DAG)? For example, they could be shown as a "leg" (similar to, but not the same as, a child) in the DAG, or in some other way that exposes the physical plan and metrics.

       

      Attachments

        1. simple2.scala
          9 kB
          Eric Yang
        2. ScanExistingRDD.jpg
          58 kB
          Eric Yang
        3. eventLogs-local-1708032228180.zip
          23 kB
          Eric Yang


          People

            Assignee: Unassigned
            Reporter: Eric Yang (jiwen624)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: