Hive / HIVE-20165

Enable ZLIB for streaming ingest


Details

    Description

      Per gopalv's recommendation, I ran streaming ingest with and without ZLIB compression. The numbers are as follows:

       
      Compression: NONE
      Total rows committed: 93800000
      Throughput: 1563333 rows/second
      $ hdfs dfs -du -s -h /apps/hive/warehouse/prasanth.db/culvert
      14.1 G  /apps/hive/warehouse/prasanth.db/culvert
       
      Compression: ZLIB
      Total rows committed: 92100000
      Throughput: 1535000 rows/second
      $ hdfs dfs -du -s -h /apps/hive/warehouse/prasanth.db/culvert
      7.4 G  /apps/hive/warehouse/prasanth.db/culvert
       
      ZLIB gives roughly 1.9x compression (14.1 G down to 7.4 G) at only about 1.8% lower throughput (1535000 vs. 1563333 rows/second). We should enable ZLIB by default for streaming ingest.
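
The comparison above comes down to two ratios, which can be checked with a short sketch. The function names here are hypothetical and exist only for illustration; the inputs are the sizes and throughputs reported in the two runs.

```python
def compression_ratio(uncompressed_size, compressed_size):
    """Return how many times smaller the compressed output is."""
    return uncompressed_size / compressed_size


def throughput_drop_pct(baseline, candidate):
    """Return the throughput loss of candidate relative to baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0


# Figures taken from the two runs reported in the description.
ratio = compression_ratio(14.1, 7.4)              # sizes in G, from hdfs dfs -du
drop = throughput_drop_pct(1_563_333, 1_535_000)  # rows/second

print(f"compression ratio: {ratio:.2f}x")  # compression ratio: 1.91x
print(f"throughput drop: {drop:.2f}%")     # throughput drop: 1.81%
```

Both results are close to the rounded "2x" and "2%" figures, so the trade-off described holds for this workload. Other data sets may compress differently.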

      Attachments

        1. HIVE-20165.1.patch
          2 kB
          Prasanth Jayachandran
        2. HIVE-20165.2.patch
          4 kB
          Prasanth Jayachandran
        3. HIVE-20165.3.patch
          4 kB
          Prasanth Jayachandran
        4. HIVE-20165.1.branch-3.patch
          4 kB
          Gopal Vijayaraghavan

          People

            prasanth_j Prasanth Jayachandran
