IMPALA-2017

Lazy materialization of Parquet columns during query

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: Impala 1.4, Impala 2.0, Impala 2.1, Impala 2.2
    • Fix Version/s: None
    • Component/s: Backend

    Description

      When I run a query that returns a single row from a 4-billion-row table, it takes ~30 seconds with 'select * ...' but only ~3 seconds with 'select field1, field2 ...'. This is repeatable.

      Given these times, it would seem that the 'select *' query materializes all the fields of every row, whether or not the row matches the predicate.

      Lazy materialization, i.e. materializing the remaining columns only for rows that pass the filter, could improve performance.
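
      The idea can be sketched as follows. This is only an illustration in Python over in-memory lists, not Impala's Parquet scanner; the names scan_eager and scan_lazy and the "decoded" counter are hypothetical and stand in for the cost of decoding column values.

```python
from typing import Any, Callable, Dict, List, Tuple

Columns = Dict[str, List[Any]]
Row = Dict[str, Any]


def scan_eager(columns: Columns, predicate_col: str,
               predicate: Callable[[Any], bool]) -> Tuple[List[Row], int]:
    """Materialize every column of every row, then apply the filter."""
    names = list(columns)
    decoded = 0  # number of column values touched (stand-in for decode cost)
    out: List[Row] = []
    for i in range(len(columns[predicate_col])):
        row = {name: columns[name][i] for name in names}
        decoded += len(names)
        if predicate(row[predicate_col]):
            out.append(row)
    return out, decoded


def scan_lazy(columns: Columns, predicate_col: str,
              predicate: Callable[[Any], bool]) -> Tuple[List[Row], int]:
    """Read only the predicate column first; touch the other columns
    only for rows that pass the filter."""
    names = list(columns)
    decoded = 0
    out: List[Row] = []
    for i, value in enumerate(columns[predicate_col]):
        decoded += 1
        if predicate(value):
            row = {name: columns[name][i]
                   for name in names if name != predicate_col}
            row[predicate_col] = value
            decoded += len(names) - 1
            out.append(row)
    return out, decoded
```

      Both functions return the same rows. In this sketch the lazy variant touches one value per row plus the remaining columns of each matching row, which is the saving a single-row 'select *' over a 35-field table would be expected to benefit from.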

      These four queries were run back to back. The actual returned data is elided (sorry). The table has 35 fields.

      0: jdbc:hive2://atl1c1r2data09.vldb-bo.secure> select * from events where event_id=1416403791; 
      <elided>
      1 row selected (33.777 seconds)
      0: jdbc:hive2://atl1c1r2data09.vldb-bo.secure> select event_id, client_id from events where event_id=1416403791;
      +-------------+------------+--+
      | event_id | client_id |
      +-------------+------------+--+
      | 1416403791 | <elided> |
      +-------------+------------+--+
      1 row selected (3.363 seconds)
      0: jdbc:hive2://atl1c1r2data09.vldb-bo.secure> select * from events where event_id=1416403791; 
      <elided>
      1 row selected (33.138 seconds)
      0: jdbc:hive2://atl1c1r2data09.vldb-bo.secure> select event_id, client_id from events where event_id=1416403791;
      +-------------+------------+--+
      | event_id | client_id |
      +-------------+------------+--+
      | 1416403791 | <elided> |
      +-------------+------------+--+
      1 row selected (3.074 seconds)
      0: jdbc:hive2://atl1c1r2data09.vldb-bo.secure>
      

            People

              Assignee: Abhishek Rawat (arawat)
              Reporter: Lou Bershad (lbershad_impala_629c)
              Votes: 3
              Watchers: 23
