Hive / HIVE-14800

Handle off by 3 in ORC split generation based on split strategy used


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      The BI split strategy apparently generates splits starting at offset 0.
      The ETL split strategy skips the ORC header and generates a split starting at offset 3.

      HiveSplitGenerator contains a workaround that handles this so that splits are consistent. Ideally, ORC split generation itself should take care of it.
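
      To illustrate the mismatch, here is a minimal sketch of one way split offsets could be normalized so that a split starting at offset 0 and one starting at offset 3 compare as equal. It is not the workaround in HiveSplitGenerator and not ORC's split generation code. The class OrcSplitOffsets and the method normalize are hypothetical names introduced only for this example. The only fact taken from the file format is that an ORC file begins with the 3-byte magic "ORC".

```java
/** Hypothetical helper for illustration; not part of Hive or ORC. */
public final class OrcSplitOffsets {
    /** An ORC file begins with the 3-byte magic "ORC". */
    public static final long ORC_HEADER_LENGTH = 3;

    private OrcSplitOffsets() {}

    /**
     * Returns {start, length} for a split, moving a start that falls inside
     * the file header to just past it while keeping the end offset fixed.
     * Assumes start and length are non-negative.
     */
    public static long[] normalize(long start, long length) {
        if (start >= ORC_HEADER_LENGTH) {
            return new long[] {start, length};
        }
        long end = start + length;
        // Never move the start beyond the end of a very short split.
        long newStart = Math.min(ORC_HEADER_LENGTH, end);
        return new long[] {newStart, end - newStart};
    }

    public static void main(String[] args) {
        long[] bi = normalize(0, 1000);  // BI-style split from offset 0
        long[] etl = normalize(3, 997);  // ETL-style split from offset 3
        System.out.println(bi[0] + "," + bi[1]);   // prints "3,997"
        System.out.println(etl[0] + "," + etl[1]); // prints "3,997"
    }
}
```

      With a normalization like this, both strategies would describe the first split of a file with the same start and length, which is the consistency the description asks ORC split generation to provide on its own.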

      cc prasanth_j, gopalv


          People

            Assignee: Unassigned
            Reporter: Siddharth Seth (sseth)