Spark / SPARK-41952

Upgrade Parquet to fix off-heap memory leaks in Zstd codec

Details

    Description

      Recently, a native memory leak was discovered in Parquet's use of the Zstd decompressor from the luben/zstd-jni library (PARQUET-2160).

      This is problematic to the point where we can't use Parquet with Zstd: pervasive OOMs take down our executors and disrupt our jobs.

      Luckily, a fix addressing this has already landed in Parquet:
      https://github.com/apache/parquet-mr/pull/982

       

      Now we just need to make sure that:

      1. An updated version of Parquet is released in a timely manner
      2. Spark is upgraded to this new version in the upcoming release

       

          People

            chengpan Cheng Pan
            alexey.kudinkin Alexey Kudinkin