Hive / HIVE-13232

Aggressively drop compression buffers in ORC OutStreams

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0, 2.0.1, 2.1.0
    • Component/s: ORC
    • Labels: None

    Description

      In Hive 0.11, when ORC's OutStreams were flushed, they dropped all of their buffers. In the patch for HIVE-4324, we inadvertently changed that behavior so that one of the buffers is retained after a flush. For queries with many open writers, which are already under significant memory pressure, this can noticeably increase memory usage.
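
      The intended behavior can be illustrated with a minimal sketch. This is not Hive's OutStream implementation or the attached patch; the class SketchOutStream and all of its members are hypothetical names, and compression is omitted. It only shows the idea of releasing every buffer reference on flush so that an idle stream retains no memory.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical, simplified stand-in for an ORC output stream.
 * Not Hive's OutStream: names and structure are assumptions for illustration.
 */
class SketchOutStream {
    private final int bufferSize;
    private ByteBuffer current;   // working buffer, allocated lazily on first write
    private long bytesFlushed;    // stands in for data handed to a downstream receiver

    SketchOutStream(int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.bufferSize = bufferSize;
    }

    void write(byte[] data) {
        int offset = 0;
        while (offset < data.length) {
            if (current == null) {
                current = ByteBuffer.allocate(bufferSize);
            }
            int n = Math.min(current.remaining(), data.length - offset);
            current.put(data, offset, n);
            offset += n;
            if (!current.hasRemaining()) {
                spill(); // buffer is full: hand it off and release it
            }
        }
    }

    /** Hands the buffered bytes downstream and drops the buffer reference. */
    private void spill() {
        bytesFlushed += current.position();
        current = null;
    }

    /** After flush, no buffer is retained, even if the last one was only partly filled. */
    void flush() {
        if (current != null && current.position() > 0) {
            spill();
        }
        current = null;
    }

    /** Bytes of buffer capacity this stream still holds on to. */
    long retainedBytes() {
        return current == null ? 0 : current.capacity();
    }

    long bytesFlushed() {
        return bytesFlushed;
    }
}
```

      In this sketch, a stream that has written 10 bytes into a 1024-byte buffer holds 1024 bytes until flush() is called and 0 bytes afterwards. With many writers open at once, retaining even one such buffer per stream adds up.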

      Note that "hive.optimize.sort.dynamic.partition" avoids this problem by sorting on the dynamic partition key, so that only a single ORC writer is open at a time. Enabling it uses memory more effectively and avoids creating ORC files with very small stripes, which in turn improves downstream performance.
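
      The effect of sorting on the number of simultaneously open writers can be sketched as follows. This is a toy model, not Hive code: OpenWriterSketch and its methods are hypothetical, partition keys are assumed to be non-null strings, and a "writer" is just a counter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical toy model of how many writers are open at once. */
class OpenWriterSketch {

    /** Unsorted rows: one writer per distinct key, all kept open until the end. */
    static int peakWritersUnsorted(List<String> partitionKeys) {
        Set<String> open = new HashSet<>(partitionKeys);
        return open.size();
    }

    /** Rows sorted by key: the previous writer is closed as soon as the key changes. */
    static int peakWritersSorted(List<String> partitionKeys) {
        List<String> sorted = new ArrayList<>(partitionKeys);
        Collections.sort(sorted);
        int open = 0;
        int peak = 0;
        String currentKey = null;
        for (String key : sorted) {
            if (!key.equals(currentKey)) {
                if (currentKey != null) {
                    open--; // close the writer for the previous partition
                }
                open++;     // open a writer for the new partition
                currentKey = key;
            }
            peak = Math.max(peak, open);
        }
        return peak;
    }
}
```

      For the keys b, a, b, c, a, the unsorted model peaks at 3 open writers and the sorted model at 1, which is why per-writer buffer retention matters much less once the rows are sorted.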

      Attachments

        1. HIVE-13232.patch
          3 kB
          Owen O'Malley
        2. HIVE-13232.patch
          0.7 kB
          Prasanth Jayachandran
        3. HIVE-13232.patch
          0.7 kB
          Owen O'Malley
        4. HIVE-13232-branch-1.patch
          2 kB
          Prasanth Jayachandran

            People

              Assignee: Owen O'Malley
              Reporter: Owen O'Malley