IMPALA-12861

File formats are confused when an Iceberg table has mixed formats

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 4.3.0
    • Fix Version/s: None
    • Component/s: Frontend

    Description

      Repro steps:
      create table mixed_ice (i int, year int) partitioned by spec (year) stored as iceberg tblproperties('format-version'='2');
       
      1) populate one partition with Impala (parquet)
      insert into mixed_ice values (1, 2024), (2, 2024);
       
      2) change the write format:
      alter table mixed_ice set tblproperties ('write.format.default'='orc');
       
      3) populate another partition with Hive (orc)
      insert into mixed_ice values (1, 2025), (2, 2025), (3, 2025);
       
      4) then query just the parquet partition:
      explain select * from mixed_ice where year = 2024;

      | F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1                                    |
      | Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB thread-reservation=1      |
      |   PLAN-ROOT SINK                                                                         |
      |   |  output exprs: default.mixed_ice.i, default.mixed_ice.year                           |
      |   |  mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0 |
      |   |                                                                                      |
      |   01:EXCHANGE [UNPARTITIONED]                                                            |
      |      mem-estimate=16.00KB mem-reservation=0B thread-reservation=0                        |
      |      tuple-ids=0 row-size=8B cardinality=2                                               |
      |      in pipelines: 00(GETNEXT)                                                           |
      |                                                                                          |
      | F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1                                           |
      | Per-Host Resources: mem-estimate=64.05MB mem-reservation=32.00KB thread-reservation=2    |
      |   DATASTREAM SINK [FRAGMENT=F01, EXCHANGE=01, UNPARTITIONED]                             |
      |   |  mem-estimate=48.00KB mem-reservation=0B thread-reservation=0                        |
      |   00:SCAN HDFS [default.mixed_ice, RANDOM]                                               |
      |      HDFS partitions=1/1 files=1 size=602B                                               |
      |      Iceberg snapshot id: 4964066258730898133                                            |
      |      skipped Iceberg predicates: `year` = CAST(2024 AS INT)                              |
      |      stored statistics:                                                                  |
      |        table: rows=5 size=945B                                                           |
      |        columns: unavailable                                                              |
      |      extrapolated-rows=disabled max-scan-range-rows=5                                    |
      |      file formats: [ORC, PARQUET]                                                        |
      |      mem-estimate=64.00MB mem-reservation=32.00KB thread-reservation=1                   |
      |      tuple-ids=0 row-size=8B cardinality=2                                               |
      |      in pipelines: 00(GETNEXT)                                                           |
      +------------------------------------------------------------------------------------------+ 

      Note the 'file formats: [ORC, PARQUET]' line, even though this query reads only a Parquet file.
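
      To check this symptom mechanically, for example in a regression test, the reported formats can be extracted from the EXPLAIN text. The following is a minimal sketch that assumes the plan text has the shape shown above; the helper name explain_file_formats is hypothetical and not part of Impala, and it only looks at the first 'file formats' line it finds.

```python
import re
from typing import List


def explain_file_formats(explain_text: str) -> List[str]:
    """Return the formats listed on the first 'file formats: [...]' line.

    Hypothetical helper for illustration; it relies only on the plan text
    layout shown in this issue.
    """
    match = re.search(r"file formats:\s*\[([^\]]*)\]", explain_text)
    if match is None:
        return []
    return [fmt.strip() for fmt in match.group(1).split(",") if fmt.strip()]


# Two lines copied from the plan above.
sample = """\
|   00:SCAN HDFS [default.mixed_ice, RANDOM]                                               |
|      file formats: [ORC, PARQUET]                                                        |
"""

formats = explain_file_formats(sample)
print(formats)  # ['ORC', 'PARQUET'], although only Parquet is expected for year = 2024
```

      A test for the year = 2024 query could then assert that the returned list contains only PARQUET.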
       
      Some analysis:
      When IcebergScanNode is created it holds the correct information about file formats (Parquet).

      Later on, the parent class, HdfsScanNode, also tries to populate the file formats.
       
      It uses whatever getSampledOrRawPartitions() returns. In this case 'sampledPartitions_' is null, so it returns 'partitions_'.
       
      Apparently, this 'partitions_' member holds the partition with the ORC file, so ORC is added to 'fileFormats_'. Unfortunately, getSampledOrRawPartitions() is called in multiple places within HdfsScanNode, and it returns the wrong partition there as well.
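
      The sequence described in this analysis can be illustrated with a toy model. This is a simplified sketch written in Python, not Impala's actual Java code: the class ToyScanNode, the Partition type, and the method populate_file_formats are hypothetical, and only the roles of 'partitions_', 'sampledPartitions_', 'fileFormats_' and getSampledOrRawPartitions() are taken from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Partition:
    """Hypothetical stand-in for a partition known to the scan node."""
    name: str
    file_format: str


@dataclass
class ToyScanNode:
    """Toy model of the behaviour described above; not Impala's real code."""
    partitions: List[Partition]                            # role of 'partitions_'
    sampled_partitions: Optional[List[Partition]] = None   # role of 'sampledPartitions_'
    file_formats: Set[str] = field(default_factory=set)    # role of 'fileFormats_'

    def get_sampled_or_raw_partitions(self) -> List[Partition]:
        # Fall back to the raw partition list when nothing was sampled.
        if self.sampled_partitions is not None:
            return self.sampled_partitions
        return self.partitions

    def populate_file_formats(self) -> None:
        # Parent-class step: add the format of every returned partition,
        # regardless of which files the query will actually read.
        for partition in self.get_sampled_or_raw_partitions():
            self.file_formats.add(partition.file_format)


# The Iceberg-specific node starts out knowing the query reads only Parquet.
node = ToyScanNode(
    partitions=[Partition("mixed_ice", "ORC")],
    file_formats={"PARQUET"},
)
node.populate_file_formats()
print(sorted(node.file_formats))  # ['ORC', 'PARQUET']
```

      In this model the extra ORC entry comes purely from the fallback to the raw partition list, which matches the 'file formats: [ORC, PARQUET]' line in the plan.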

      Next steps:

      Check what other issues getSampledOrRawPartitions() can cause with tables that have multiple file formats. Also check whether 'partitions_' can be populated properly.
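
      One possible direction, sketched here only as an assumption and not as the agreed fix, is to derive the reported formats from the data files that remain after predicate evaluation instead of from the raw partition list. The DataFile type, the file paths, and the formats_of helper below are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class DataFile:
    """Hypothetical description of one data file of the table."""
    path: str
    file_format: str
    year: int


def formats_of(files: Iterable[DataFile]) -> Set[str]:
    """Collect the distinct formats of exactly the given files."""
    return {data_file.file_format for data_file in files}


# Made-up file names standing in for the two partitions from the repro.
all_files: List[DataFile] = [
    DataFile("year=2024/data-0.parq", "PARQUET", 2024),
    DataFile("year=2025/data-0.orc", "ORC", 2025),
]

# Keep only the files that match the predicate year = 2024.
selected = [data_file for data_file in all_files if data_file.year == 2024]

print(sorted(formats_of(all_files)))  # ['ORC', 'PARQUET']
print(sorted(formats_of(selected)))   # ['PARQUET']
```

      Whether the real scan node can obtain such a filtered file list at the point where it fills in the formats is exactly what the next steps above need to establish.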

      Attachments

        1. multi_file_table_crash
          862 kB
          Gabor Kaszab

          People

            Assignee: Unassigned
            Reporter: Gabor Kaszab