IMPALA-4163

Introduce SORTBY plan hint for insert statements

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: Impala 2.2, Impala 2.3.0, Impala 2.4.0, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0
    • Fix Version/s: None
    • Component/s: Frontend

    Description

      To improve compression and/or the effectiveness of min/max pruning, it is desirable to control the order in which rows are inserted into a table (mostly for Parquet).

      To that end, we should introduce a "sortby" plan hint for insert statements. Example:

      CREATE TABLE dst (...);
      INSERT INTO dst /*+ sortby(day,hour) */ SELECT * FROM src;
      

      This would produce the following plan:
      SCAN -> SORT(day,hour) -> TABLE SINK
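
      To illustrate why the sorted plan could help, here is a sketch of a query that might benefit. The filter values are hypothetical, and whether row groups are actually skipped depends on the scanner's min/max filtering support, which this issue does not specify:

      ```sql
      -- Assumes dst was populated through SCAN -> SORT(day,hour) -> TABLE SINK,
      -- so each Parquet row group covers a narrow range of (day, hour) values.
      SELECT count(*)
      FROM dst
      WHERE day = 15 AND hour = 3;  -- hypothetical filter values
      ```

      With rows ordered this way, a row group whose min/max range for day excludes 15 can in principle be ruled out from its statistics alone, without reading its data.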

      Syntax and behavior

       INSERT INTO dst /*+ sortby(day,hour) */ SELECT * FROM src; 
      • We will not support the legacy hint style with brackets, e.g. [sortby(day,hour)].
      • To keep the "clustered" hint strictly separate from the "sortby" hint, it is only legal to use non-partition columns in "sortby" for HDFS tables.
      • Similarly, for Kudu tables it is only legal to mention non-primary-key columns in "sortby".
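
      The restrictions above could look as follows. This is only a sketch of the proposed syntax: the table and column names (dst_part, kudu_dst, id, val, year, month) are hypothetical, the Kudu DDL assumes the syntax of later Impala releases, and how an illegal hint would be reported (error or warning) is not specified in this issue:

      ```sql
      -- HDFS table partitioned by (year, month); day and hour are regular columns.
      CREATE TABLE dst_part (id BIGINT, val STRING, day INT, hour INT)
        PARTITIONED BY (year INT, month INT) STORED AS PARQUET;

      -- Legal under the proposal: sortby names only non-partition columns.
      INSERT INTO dst_part PARTITION (year, month) /*+ sortby(day,hour) */
        SELECT id, val, day, hour, year, month FROM src;

      -- Illegal under the proposal: year is a partition column.
      INSERT INTO dst_part PARTITION (year, month) /*+ sortby(year) */
        SELECT id, val, day, hour, year, month FROM src;

      -- Kudu table whose primary key is (id).
      CREATE TABLE kudu_dst (id BIGINT, day INT, hour INT, val STRING, PRIMARY KEY (id))
        PARTITION BY HASH (id) PARTITIONS 4 STORED AS KUDU;

      -- Legal under the proposal: day and hour are not primary-key columns.
      INSERT INTO kudu_dst /*+ sortby(day,hour) */
        SELECT id, day, hour, val FROM src;

      -- Illegal under the proposal: id is a primary-key column.
      INSERT INTO kudu_dst /*+ sortby(id) */
        SELECT id, day, hour, val FROM src;
      ```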

            People

              Assignee: Lars Volker
              Reporter: Alexander Behm