IMPALA-9951

Skew in analytic sorts when partition key has low cardinality

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • Frontend

    Description

      In queries like TPC-DS Q67, the cardinality of the analytic function's PARTITION BY expression may be much lower than the parallelism of the input fragment. In that case only a few instances of the analytic fragment receive rows after the exchange, so the runtime of the sort is skewed across instances. We could mitigate the problem by doing the expensive sort before the exchange, so that the analytic fragment only needs to merge its sorted inputs and evaluate the analytic function over the result.

      The impact of this is greater with multithreading, so I am considering changing the default only when mt_dop > 0.
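
      To make the skew concrete, here is a minimal Python simulation of the two orderings described above. It is only a sketch and not Impala planner or backend code: the instance count, key cardinality, row counts, and the helper names (make_sender_rows, route, sort_after_exchange, sort_before_exchange) are all hypothetical, and routing uses a simple modulo in place of a real hash exchange.

```python
import heapq
import random

NUM_INSTANCES = 16      # hypothetical fragment parallelism
NUM_KEYS = 4            # hypothetical PARTITION BY cardinality
ROWS_PER_SENDER = 1000  # hypothetical rows held by each sending instance


def make_sender_rows(seed):
    """Build (partition_key, order_value) rows for one sending instance."""
    rng = random.Random(seed)
    return [(rng.randrange(NUM_KEYS), rng.random())
            for _ in range(ROWS_PER_SENDER)]


def route(row):
    """Pick the receiving instance from the partition key (stand-in for a hash)."""
    return row[0] % NUM_INSTANCES


def sort_after_exchange(senders):
    """Exchange first, then let each receiver sort everything it was sent."""
    receivers = [[] for _ in range(NUM_INSTANCES)]
    for rows in senders:
        for row in rows:
            receivers[route(row)].append(row)
    sort_sizes = [len(rows) for rows in receivers]
    return [sorted(rows) for rows in receivers], sort_sizes


def sort_before_exchange(senders):
    """Sort on each sender, then let each receiver merge the sorted runs."""
    sort_sizes = [len(rows) for rows in senders]
    # runs[receiver][sender] is one sorted run travelling over the exchange.
    runs = [[[] for _ in senders] for _ in range(NUM_INSTANCES)]
    for sender_id, rows in enumerate(senders):
        for row in sorted(rows):
            runs[route(row)][sender_id].append(row)
    merged = [list(heapq.merge(*receiver_runs)) for receiver_runs in runs]
    return merged, sort_sizes


senders = [make_sender_rows(seed) for seed in range(NUM_INSTANCES)]
after_rows, after_sizes = sort_after_exchange(senders)
before_rows, before_sizes = sort_before_exchange(senders)

# Both orderings hand the analytic the same sorted rows per receiver.
assert after_rows == before_rows

print("receivers that sort (sort after exchange):",
      sum(1 for size in after_sizes if size > 0))
print("largest sort input, sort after exchange:", max(after_sizes))
print("largest sort input, sort before exchange:", max(before_sizes))
```

      In this sketch the sort after the exchange runs on at most NUM_KEYS receivers, each with a large input, while the sort before the exchange is spread evenly over all senders. The merge still lands on the few receivers that get rows, but merging already-sorted runs is cheaper than sorting the same rows from scratch.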

            People

              Assignee: Unassigned
              Reporter: Tim Armstrong (tarmstrong)