Hive / HIVE-25628

Avoid unnecessary file ops if Iceberg table is LLAP cached

Details

    Description

      When query execution is vectorized for an Iceberg table, we currently need an extra file open operation on the ORC file to learn the file schema (which is later matched against the logical schema).

      In an LLAP configuration, the file schema could instead be retrieved from the LLAP cache, since ORC metadata is cached there, so we should avoid the file operation whenever possible.
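
      The cache-first lookup described above could look roughly like the following. This is only a minimal sketch, not the actual Hive or LLAP code: `SchemaLookupSketch`, `cache`, `fileSchema` and the string-typed keys and schemas are all hypothetical stand-ins chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: none of these names are Hive or LLAP APIs.
final class SchemaLookupSketch {
    // Stand-in for cached ORC metadata, keyed by an opaque file key.
    private final Map<String, String> cachedSchemas = new HashMap<>();
    // Stand-in for opening the ORC file and reading its schema.
    private final Function<String, String> fileReader;
    // Counts how often the expensive file open path was taken.
    private int fileOpens = 0;

    SchemaLookupSketch(Function<String, String> fileReader) {
        this.fileReader = fileReader;
    }

    // Simulates metadata that is already present in the cache.
    void cache(String fileKey, String schema) {
        cachedSchemas.put(fileKey, schema);
    }

    // Returns the cached schema if present, otherwise falls back to the file.
    String fileSchema(String fileKey) {
        String cached = cachedSchemas.get(fileKey);
        if (cached != null) {
            return cached; // no file operation needed
        }
        fileOpens++;
        return fileReader.apply(fileKey);
    }

    int fileOpens() {
        return fileOpens;
    }
}
```

      With this shape, a file whose metadata is already cached never triggers the fallback, while an uncached file still takes the original path.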

      Also: LLAP relies on cache keys that are usually triplets of file information, constructed from the result of an FS.listStatus call. For Iceberg tables we should rely on the same file information as provided by Iceberg's metadata, which spares this call too.
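
      To illustrate the second point, here is a hedged sketch of building the same cache key either from a file system listing or from values the table metadata already carries. The issue does not spell out the fields of the triplet; path, length and modification time are assumed here purely for illustration, and `CacheKeySketch`, `FileKey`, `keyFromListing` and `keyFromMetadata` are hypothetical names, not Hive, LLAP or Iceberg APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: names are illustrative, not Hive, LLAP or Iceberg APIs.
final class CacheKeySketch {

    // Assumed triplet: path, length and modification time of the file.
    static final class FileKey {
        final String path;
        final long length;
        final long modificationTime;

        FileKey(String path, long length, long modificationTime) {
            this.path = path;
            this.length = length;
            this.modificationTime = modificationTime;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof FileKey)) return false;
            FileKey other = (FileKey) o;
            return path.equals(other.path)
                && length == other.length
                && modificationTime == other.modificationTime;
        }

        @Override
        public int hashCode() {
            return Objects.hash(path, length, modificationTime);
        }
    }

    // Stand-in for a file system: path -> {length, modificationTime}.
    private final Map<String, long[]> fileSystem = new HashMap<>();
    // Counts simulated listing round trips.
    private int listingCalls = 0;

    void addFile(String path, long length, long modificationTime) {
        fileSystem.put(path, new long[] {length, modificationTime});
    }

    // Builds the key via a simulated file system listing call.
    FileKey keyFromListing(String path) {
        listingCalls++;
        long[] status = fileSystem.get(path);
        return new FileKey(path, status[0], status[1]);
    }

    // Builds an equal key from already known values, with no listing call.
    static FileKey keyFromMetadata(String path, long length, long modificationTime) {
        return new FileKey(path, length, modificationTime);
    }

    int listingCalls() {
        return listingCalls;
    }
}
```

      As long as both paths produce equal keys, cache entries populated one way remain reachable the other way, and the metadata-based path skips the listing entirely.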

            People

              Assignee: Ádám Szita (szita)
              Reporter: Ádám Szita (szita)
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h