Hive / HIVE-24734

Sanity check in HiveSplitGenerator available slot calculation

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: Tez
    • Labels: None

    Description

      HiveSplitGenerator calculates the number of available slots from available memory like this:

      if (getContext() != null) {
        totalResource = getContext().getTotalAvailableResource().getMemory();
        taskResource = getContext().getVertexTaskResource().getMemory();
        availableSlots = totalResource / taskResource;
      }
      

      I had a scenario where the total memory was calculated correctly, but the task memory was returned as -1. This led to errors like these:

      tez.HiveSplitGenerator: Number of input splits: 1. -3641 available slots, 1.7 waves. Input format is: org.apache.hadoop.hive.ql.io.HiveInputFormat
      
      Estimated number of tasks: -6189 for bucket 1
      
      java.lang.IllegalArgumentException: Illegal Capacity: -6189
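
The numbers in these log lines are consistent with a task memory of -1 propagating through the division. The sketch below reproduces that arithmetic in isolation. It makes several assumptions that are not confirmed by the report: a total memory of 3641 (inferred from the -3641 slots), a task estimate computed as slots times waves truncated to an int, and an ArrayList as the collection that rejects the negative capacity. The class and method names are hypothetical and are not part of Hive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not Hive code.
class NegativeSlotsDemo {

    // Mirrors the integer division quoted in the description.
    static int availableSlots(int totalResource, int taskResource) {
        return totalResource / taskResource;
    }

    // Assumption: the task estimate is slots * waves, truncated to int.
    static int estimatedTasks(int availableSlots, double waves) {
        return (int) (availableSlots * waves);
    }

    public static void main(String[] args) {
        // Assumed inputs: 3641 total memory, -1 task memory, 1.7 waves.
        int slots = availableSlots(3641, -1);
        int tasks = estimatedTasks(slots, 1.7);
        System.out.println(slots + " available slots, " + tasks + " estimated tasks");

        try {
            // ArrayList rejects a negative initial capacity.
            List<String> perTask = new ArrayList<>(tasks);
            System.out.println("created list of size " + perTask.size());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Illegal Capacity: -6189
        }
    }
}
```

Under these assumptions a single bad value from getVertexTaskResource() is enough to produce every error shown above.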
      

      Admittedly, this happened during development and hopefully will not occur on a properly configured cluster. (I'm not sure what the issue was on my setup; possibly -Xmx was set higher than the physical memory.)

      In any case, an availableSlots value below 1 will never lead to the desired behavior, so in such cases we could emit a warning and correct the value to 1.
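
A minimal sketch of what such a sanity check could look like, written as a standalone helper rather than as a patch to HiveSplitGenerator. The class name, method name, and warning text are hypothetical, and System.err stands in for whatever logger the real code would use. The extra guard on a non-positive task memory is an addition beyond the proposal, included because a value of 0 would otherwise raise an ArithmeticException.

```java
// Hypothetical helper, not Hive code.
class SlotSanityCheck {

    // Returns totalResource / taskResource, corrected to 1 when the result
    // would be unusable (non-positive task memory or fewer than one slot).
    static int availableSlots(int totalResource, int taskResource) {
        if (taskResource <= 0) {
            // Covers the reported -1 case and avoids division by zero.
            System.err.println("WARN: invalid task memory " + taskResource
                + ", falling back to 1 available slot");
            return 1;
        }
        int slots = totalResource / taskResource;
        if (slots < 1) {
            System.err.println("WARN: computed " + slots
                + " available slots, correcting to 1");
            return 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(availableSlots(8192, 1024)); // 8
        System.out.println(availableSlots(3641, -1));   // 1, with a warning
    }
}
```

Correcting the value instead of throwing keeps split generation running, at the cost of possibly hiding a misconfiguration, which is why the warning matters.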

          People

            Assignee: Unassigned
            Reporter: Zoltan Matyus (zmatyus)
            Votes: 0
            Watchers: 2
