HIVE-19588: Several invocations of file listing when creating VectorizedOrcAcidRowBatchReader

Details

    Description

      It looks like we do a file listing several times when creating a single instance of VectorizedOrcAcidRowBatchReader.
      AcidUtils.parseBaseOrDeltaBucketFilename() does a full file listing (when there are files with the bucket_* prefix) just to pick a single file out of a path and check whether it has the ACID schema (as part of HIVE-18190).
      A full file listing is also done in each of the following places:
      1) when populating ColumnizedDeleteEventRegistry
      2) when populating SortMergedDeleteEventRegistry
      3) twice in computeOffsetAndBucket()
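
      To illustrate the cost pattern described above, here is a minimal sketch of listing a directory once and reusing the result for every later lookup. It is not the attached patch and does not use Hive or Hadoop APIs: the class CachedDirListing, its methods, and the temporary directory are hypothetical, and java.nio.file stands in for the real file system calls so the example runs on its own.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

/** Hypothetical helper (not part of Hive): lists each directory at most once. */
final class CachedDirListing {
    private final Map<Path, List<Path>> cache = new HashMap<>();
    private int listingCalls = 0; // number of real file-system listings performed

    /** Returns the children of dir; only the first call per dir touches the file system. */
    List<Path> list(Path dir) throws IOException {
        List<Path> cached = cache.get(dir);
        if (cached != null) {
            return cached;
        }
        listingCalls++;
        List<Path> children = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.sorted().forEach(children::add);
        }
        List<Path> result = Collections.unmodifiableList(children);
        cache.put(dir, result);
        return result;
    }

    /** Returns the first file whose name starts with "bucket_", or null if there is none. */
    Path firstBucketFile(Path dir) throws IOException {
        for (Path p : list(dir)) {
            if (p.getFileName().toString().startsWith("bucket_")) {
                return p;
            }
        }
        return null;
    }

    int listingCalls() {
        return listingCalls;
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout for the demo: a directory holding two bucket files.
        Path delta = Files.createTempDirectory("delta_demo");
        Files.createFile(delta.resolve("bucket_00000"));
        Files.createFile(delta.resolve("bucket_00001"));

        CachedDirListing listing = new CachedDirListing();
        listing.firstBucketFile(delta); // e.g. the schema check
        listing.list(delta);            // e.g. building a delete event registry
        listing.list(delta);            // e.g. computing offsets
        System.out.println("file-system listings: " + listing.listingCalls());
    }
}
```

      Running main prints "file-system listings: 1": three lookups share one listing. Whether a per-reader cache like this is appropriate in the actual reader depends on how long the listing stays valid, which this sketch does not address.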

       

      Attaching profiles which gopalv took while debugging. 

      Attachments

        1. HIVE-19588.4.patch
          20 kB
          Prasanth Jayachandran
        2. HIVE-19588.3.patch
          20 kB
          Prasanth Jayachandran
        3. HIVE-19588.2.patch
          20 kB
          Prasanth Jayachandran
        4. HIVE-19588.1.patch
          16 kB
          Prasanth Jayachandran
        5. Screen Shot 2018-05-16 at 2.23.25 PM.png
          151 kB
          Prasanth Jayachandran

            People

              prasanth_j Prasanth Jayachandran
              ndembla Nita Dembla
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved: