  1. IMPALA
  2. IMPALA-12108

Add support for writing data with LZ4's high compression mode

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 4.3.0
    • Fix Version/s: None
    • Component/s: Backend

    Description

      LZ4 has a high compression mode that gets higher compression ratios than Snappy while maintaining high decompression speeds. The tradeoff is that compression is very slow. We should add support for writing data with LZ4 high compression mode. This would let us get a sense of the performance for writing and reading.

      See this benchmark on the LZ4 page:

      https://github.com/lz4/lz4#benchmarks

      In my hand tests, Parquet/LZ4 is about 13% smaller than Parquet/Snappy, but it retains the fast decompression.
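
      To get a rough sense of the write/read tradeoff described above, a small harness along the following lines could be used. This is only a sketch under stated assumptions: the `measure` and `relative_saving` helpers and the sample data are hypothetical, and zlib levels 1 and 9 from the Python standard library stand in for "fast" and "high compression" settings. They are not LZ4. A real measurement would plug LZ4's default and high compression modes into the same harness.

```python
import time
import zlib


def measure(name, compress, decompress, data, repeats=3):
    """Return compressed size, ratio, and best-of-N timings for one codec setting."""
    best_c = best_d = float("inf")
    compressed = b""
    for _ in range(repeats):
        t0 = time.perf_counter()
        compressed = compress(data)
        best_c = min(best_c, time.perf_counter() - t0)

        t0 = time.perf_counter()
        restored = decompress(compressed)
        best_d = min(best_d, time.perf_counter() - t0)

        # Guard against a codec that does not round-trip the input.
        if restored != data:
            raise ValueError(f"{name}: round trip changed the data")
    return {
        "name": name,
        "compressed_bytes": len(compressed),
        "ratio": len(data) / len(compressed),
        "compress_s": best_c,
        "decompress_s": best_d,
    }


def relative_saving(baseline_bytes, candidate_bytes):
    """Fraction by which the candidate output is smaller than the baseline."""
    return 1.0 - candidate_bytes / baseline_bytes


# Hypothetical sample data: repetitive, column-like text rows.
SAMPLE = b"".join(
    b"%d,customer_%d,region_%d\n" % (i, i % 1000, i % 7) for i in range(50_000)
)

# Stand-in codecs (zlib, NOT LZ4): level 1 plays "fast", level 9 plays "high compression".
fast = measure("fast (zlib 1)", lambda d: zlib.compress(d, 1), zlib.decompress, SAMPLE)
high = measure("high (zlib 9)", lambda d: zlib.compress(d, 9), zlib.decompress, SAMPLE)

if __name__ == "__main__":
    for result in (fast, high):
        print(result)
    print("saving:", relative_saving(fast["compressed_bytes"], high["compressed_bytes"]))
```

      With this helper, a file that is "about 13% smaller" corresponds to `relative_saving(baseline, candidate)` being roughly 0.13. Comparing `compress_s` against `decompress_s` across the two settings shows whether the extra write cost buys a smaller file without slowing reads.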

          People

            Assignee: Unassigned
            Reporter: Joe McDonnell (joemcdonnell)