Hive / HIVE-17754

HCat InputJobInfo in Pig UDFContext is heavyweight, and causes OOMs in Tez AMs

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0, 3.0.0
    • Fix Version/s: None
    • Component/s: HCatalog
    • Labels: None

    Description

      HIVE-9845 reduced the size of HCat split-info to improve job-launch times for Pig/HCat jobs.
      For large Pig queries that scan many Hive partitions, the Pig UDFContext was found to store full HCat InputJobInfo objects, which exhausted the memory of the Pig Tez AM. Since this information is already stored in the HCatSplit, the serialization of InputJobInfo can be skipped.
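
      To illustrate why per-partition metadata dominates the serialized size, here is a minimal, self-contained sketch. It does not use the HCatalog or Pig APIs and is not taken from the attached patches: `JobInfo`, `PartitionMeta`, `withoutPartitions`, and `sampleJob` are hypothetical stand-ins, and plain Java serialization is assumed purely for measurement.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class JobInfoSizeSketch {

    /** Hypothetical per-partition metadata (not HCatalog's real class). */
    static class PartitionMeta implements Serializable {
        private static final long serialVersionUID = 1L;
        final String location;
        final String schema;

        PartitionMeta(String location, String schema) {
            this.location = location;
            this.schema = schema;
        }
    }

    /** Hypothetical job-level info carrying one entry per scanned partition. */
    static class JobInfo implements Serializable {
        private static final long serialVersionUID = 1L;
        final String database;
        final String table;
        final List<PartitionMeta> partitions;

        JobInfo(String database, String table, List<PartitionMeta> partitions) {
            this.database = database;
            this.table = table;
            this.partitions = partitions;
        }

        /** Returns a copy that keeps the table identity but drops the partition list. */
        JobInfo withoutPartitions() {
            return new JobInfo(database, table, new ArrayList<PartitionMeta>());
        }
    }

    /** Measures how many bytes an object occupies under Java serialization. */
    static int serializedSize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Builds a job description with the given number of made-up partitions. */
    static JobInfo sampleJob(int partitionCount) {
        List<PartitionMeta> partitions = new ArrayList<PartitionMeta>(partitionCount);
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new PartitionMeta(
                "hdfs://nn/warehouse/db.db/events/dt=" + i,
                "id:int,name:string,dt:string (partition " + i + ")"));
        }
        return new JobInfo("db", "events", partitions);
    }

    public static void main(String[] args) {
        JobInfo full = sampleJob(10_000);
        JobInfo slim = full.withoutPartitions();
        System.out.println("full: " + serializedSize(full) + " bytes");
        System.out.println("slim: " + serializedSize(slim) + " bytes");
    }
}
```

      In this sketch the full object grows linearly with the number of partitions, while the trimmed copy stays at a small constant size. That is the kind of saving the description refers to when the partition details are already carried by the splits.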

      Attachments

        1. HIVE-17754.1.patch
          66 kB
          Mithun Radhakrishnan
        2. HIVE-17754.2.patch
          67 kB
          Mithun Radhakrishnan
        3. HIVE-17754.2-branch-2.patch
          68 kB
          Mithun Radhakrishnan

            People

              Assignee: Mithun Radhakrishnan
              Reporter: Mithun Radhakrishnan
              Votes: 0
              Watchers: 1