Spark / SPARK-34651 Improve ZSTD support / SPARK-34954

Use zstd codec name in ORC file names

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.2.0
    • Component/s: SQL
    • Labels: None

    Description

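      As with the other codecs that ORC supports, Spark-generated ORC file names should include the codec name, `zstd`. Today a snappy-compressed file ends in `.snappy.orc`, while a zstd-compressed file ends in plain `.orc`.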

      SNAPPY

      scala> spark.range(10).repartition(1).write.option("compression", "snappy").orc("/tmp/snappy")
      
      $ ls -al /tmp/snappy 
      total 24
      drwxr-xr-x   6 dongjoon  wheel  192 Apr  4 12:17 .
      drwxrwxrwt  14 root      wheel  448 Apr  4 12:17 ..
      -rw-r--r--   1 dongjoon  wheel    8 Apr  4 12:17 ._SUCCESS.crc
      -rw-r--r--   1 dongjoon  wheel   12 Apr  4 12:17 .part-00000-833bb7ad-d1e1-48cc-9719-07b2d594aa4c-c000.snappy.orc.crc
      -rw-r--r--   1 dongjoon  wheel    0 Apr  4 12:17 _SUCCESS
      -rw-r--r--   1 dongjoon  wheel  231 Apr  4 12:17 part-00000-833bb7ad-d1e1-48cc-9719-07b2d594aa4c-c000.snappy.orc
      

      ZSTD (AS-IS)

                                                                                      
      scala> spark.range(10).repartition(1).write.option("compression", "zstd").orc("/tmp/zstd")
      
      $ ls -al /tmp/zstd  
      total 24
      drwxr-xr-x   6 dongjoon  wheel  192 Apr  4 12:17 .
      drwxrwxrwt  14 root      wheel  448 Apr  4 12:17 ..
      -rw-r--r--   1 dongjoon  wheel    8 Apr  4 12:17 ._SUCCESS.crc
      -rw-r--r--   1 dongjoon  wheel   12 Apr  4 12:17 .part-00000-2f403ce9-7314-4db5-bca3-b1c1dd83335f-c000.orc.crc
      -rw-r--r--   1 dongjoon  wheel    0 Apr  4 12:17 _SUCCESS
      -rw-r--r--   1 dongjoon  wheel  231 Apr  4 12:17 part-00000-2f403ce9-7314-4db5-bca3-b1c1dd83335f-c000.orc
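
      The intended naming can be sketched as a lookup from codec name to a file-name infix. This is a minimal, hypothetical illustration and not Spark's actual implementation: `OrcFileNameSketch`, `codecExtensions`, and `orcFileName` are names invented for this example, and the `.zstd` infix is assumed by analogy with `.snappy` in the listing above.

```scala
// Hypothetical sketch; these names are not taken from the Spark code base.
object OrcFileNameSketch {
  // Assumed mapping from codec name to the infix placed before ".orc".
  // Only the codecs discussed in this issue are listed.
  val codecExtensions: Map[String, String] = Map(
    "NONE"   -> "",
    "SNAPPY" -> ".snappy",
    "ZSTD"   -> ".zstd" // the entry this issue asks for
  )

  // Builds "<base><infix>.orc"; codecs missing from the map get no infix,
  // which reproduces the AS-IS zstd file name shown above.
  def orcFileName(base: String, codec: String): String = {
    val key = codec.toUpperCase(java.util.Locale.ROOT)
    val infix = codecExtensions.getOrElse(key, "")
    s"$base$infix.orc"
  }

  def main(args: Array[String]): Unit = {
    // Base name copied from the zstd listing above.
    val base = "part-00000-2f403ce9-7314-4db5-bca3-b1c1dd83335f-c000"
    println(orcFileName(base, "zstd"))
    // prints part-00000-2f403ce9-7314-4db5-bca3-b1c1dd83335f-c000.zstd.orc
  }
}
```

      With such an entry in place, the zstd output would carry the codec in its name the same way the snappy output does.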
      

          People

            Assignee: Dongjoon Hyun
            Reporter: Dongjoon Hyun