IMPALA / IMPALA-8617

Add support for lz4 in parquet

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 3.3.0
    • Component/s: Backend

    Description

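      Hadoop's LZ4 codec wraps compressed data in its own block framing (the same one the parquet-mr API uses), which is incompatible with the raw LZ4 block format.

      As a result, Parquet files compressed with LZ4 can end up in different, mutually incompatible formats depending on which library wrote them.

      The parquet-cpp API (now part of Apache Arrow) uses the LZ4 frame format, which is also incompatible with the raw LZ4 block format.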

      The current decision is to use a format compatible with Hive, Spark and parquet-mr.
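
      To make the incompatibility concrete, here is a minimal Python sketch of the two wrappers around an LZ4 payload. It assumes the simplified single-chunk case of the Hadoop framing (a 4-byte big-endian uncompressed size, then a 4-byte big-endian compressed size, then the raw LZ4 block) and the standard LZ4 frame magic number 0x184D2204. The helper names are hypothetical and are not taken from Impala, Hadoop, or Arrow.

```python
import struct

# LZ4 frame format starts with the magic number 0x184D2204, stored little-endian.
LZ4_FRAME_MAGIC = b"\x04\x22\x4d\x18"


def wrap_hadoop_style(raw_block: bytes, uncompressed_size: int) -> bytes:
    """Prefix a raw LZ4 block with Hadoop-style sizes (assumed single chunk).

    Layout: [uncompressed size: u32 BE][compressed size: u32 BE][raw LZ4 block]
    """
    return struct.pack(">II", uncompressed_size, len(raw_block)) + raw_block


def split_hadoop_style(buf: bytes) -> tuple[int, bytes]:
    """Return (uncompressed_size, raw_block) from a single-chunk Hadoop-style buffer."""
    if len(buf) < 8:
        raise ValueError("buffer too short for the 8-byte size header")
    uncompressed_size, compressed_size = struct.unpack_from(">II", buf, 0)
    if 8 + compressed_size > len(buf):
        raise ValueError("compressed size exceeds buffer length")
    return uncompressed_size, buf[8:8 + compressed_size]


def looks_like_lz4_frame(buf: bytes) -> bool:
    """Cheap check: does the buffer begin with the LZ4 frame magic number?"""
    return buf[:4] == LZ4_FRAME_MAGIC


# A raw LZ4 block holding only the literals "abc": token 0x30 = 3 literals, no match.
raw_block = b"\x30abc"
hadoop_buf = wrap_hadoop_style(raw_block, uncompressed_size=3)
```

      A decoder that expects a raw block would misread the 8-byte header in `hadoop_buf` as LZ4 tokens, and a decoder that expects the frame format would reject it for lacking the magic number, which is why readers and writers have to agree on one layout.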

       

          People

            Assignee: arawat Abhishek Rawat
            Reporter: arawat Abhishek Rawat
            Votes: 0
            Watchers: 4
