IMPALA-11972

Factor in row width during ProcessingCost calculation.


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: Impala 4.3.0
    • Fix Version/s: Impala 4.4.0
    • Component/s: Frontend
    • Labels: None

    Description

      IMPALA-11604 added the ProcessingCost (PC) concept to measure the cost for a distinct PlanNode / DataSink / PlanFragment to process its input rows globally across all of its instances.

      We should investigate whether row width should be considered when computing PC for more operators, and whether doing so makes the PC model more accurate. The code in IMPALA-11604 has a materialization cost parameter to accommodate PC where row width should factor in. Currently, the PC of ScanNode, ExchangeNode, and DataStreamSink has row width factored in through the materialization parameter.
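To make the role of a width-aware materialization parameter concrete, here is a minimal sketch in Python. It is not Impala's code: the function name, the per-row and per-byte weights, and the formula itself (row count times the sum of a per-row expression cost and a width-proportional materialization cost) are assumptions chosen purely for illustration.

```python
def processing_cost(num_rows, expr_cost_per_row, row_width_bytes, cost_per_byte):
    """Hypothetical cost model (an assumption, not Impala's formula).

    Each row pays a fixed expression-evaluation cost plus a materialization
    cost proportional to its width in bytes.
    """
    if num_rows < 0 or row_width_bytes < 0:
        raise ValueError("num_rows and row_width_bytes must be non-negative")
    materialization_cost = row_width_bytes * cost_per_byte
    return num_rows * (expr_cost_per_row + materialization_cost)


# Same row count, different row widths: the wider row is costlier to process.
narrow = processing_cost(1_000_000, expr_cost_per_row=1.0,
                         row_width_bytes=8, cost_per_byte=0.5)
wide = processing_cost(1_000_000, expr_cost_per_row=1.0,
                       row_width_bytes=256, cost_per_byte=0.5)
```

With a model of this shape, an operator that ignores width would rate both inputs equally, while a width-aware operator separates them. Whether that extra separation improves accuracy for more operators is the question this issue raises.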

      For VARCHAR, we can use average width stats, if available. For fixed-width columns, we just use the declared width. In both cases, the unit should be bytes. Including width in the cost is meant to make the resulting estimate more precise and less error-prone.
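The width rule described above could be sketched as follows. This is an illustrative Python sketch under stated assumptions: the `Column` class, its field names, and the fallback constant used when no average width stats exist are all hypothetical and do not come from the Impala code base.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed fallback (hypothetical value) for variable-width columns without stats.
DEFAULT_VARLEN_WIDTH_BYTES = 16.0


@dataclass
class Column:
    name: str
    fixed_width_bytes: Optional[int] = None   # set for fixed-width types
    avg_width_bytes: Optional[float] = None   # set when column stats are available


def estimate_row_width(columns):
    """Return the estimated row width in bytes."""
    total = 0.0
    for col in columns:
        if col.fixed_width_bytes is not None:
            # Fixed-width column: use the declared width directly.
            total += col.fixed_width_bytes
        elif col.avg_width_bytes is not None:
            # Variable-width column with stats: use the average width.
            total += col.avg_width_bytes
        else:
            # No stats available: fall back to the assumed default.
            total += DEFAULT_VARLEN_WIDTH_BYTES
    return total


columns = [
    Column("id", fixed_width_bytes=8),
    Column("name", avg_width_bytes=12.5),
    Column("comment"),  # variable-width, no stats
]
row_width = estimate_row_width(columns)
```

The fallback branch is a design choice of this sketch only; the issue text says to use average width stats "if available" and does not specify what to do otherwise.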

       


            People

              Assignee: Riza Suminto (rizaon)
              Reporter: Riza Suminto (rizaon)