Apache Tez / TEZ-955

Tez should close inputs after calling processor's close

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: None
    • Labels: None

    Description

      Hive flushes the processor pipeline in the processor's close method, and that flush might require reading additional input. Apparently Tez has already closed the inputs by then, which leads to a race condition where the reducer sometimes just hangs. (Calling read on a closed input should also throw an exception instead of blocking.)

      This is the stack trace you'll see when that happens:

      Thread 30938: (state = BLOCKED)
      sun.misc.Unsafe.park(boolean, long) @bci=0 (Interpreted frame)
      java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Interpreted frame)
      java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Interpreted frame)
      java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=442 (Interpreted frame)
      org.apache.tez.runtime.library.shuffle.common.impl.ShuffleManager.getNextInput() @bci=67, line=608 (Interpreted frame)
      org.apache.tez.runtime.library.common.readers.ShuffledUnorderedKVReader.moveToNextInput() @bci=26, line=172 (Interpreted frame)
      org.apache.tez.runtime.library.common.readers.ShuffledUnorderedKVReader.next() @bci=30, line=113 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainer[], org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe[]) @bci=158, line=99 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable() @bci=78, line=150 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(java.lang.Object, int) @bci=12, line=197 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.Operator.forward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=63, line=791 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=3, line=638 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject() @bci=109, line=670 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject() @bci=455, line=754 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(java.lang.Object, int) @bci=251, line=229 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.Operator.forward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=63, line=791 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=3, line=638 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject() @bci=109, line=670 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject() @bci=455, line=754 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(java.lang.Object, int) @bci=251, line=229 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.Operator.forward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=63, line=791 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=3, line=638 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject() @bci=109, line=670 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject() @bci=455, line=754 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(java.lang.Object, int) @bci=251, line=229 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.Operator.forward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=63, line=791 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=3, line=638 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject() @bci=109, line=670 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject() @bci=455, line=754 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(java.lang.Object, int) @bci=251, line=229 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.Operator.forward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=63, line=791 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=3, line=638 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject() @bci=109, line=670 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject() @bci=455, line=754 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(java.lang.Object, int) @bci=251, line=229 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.Operator.forward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=63, line=791 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(java.lang.Object, int) @bci=121, line=87 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.Operator.forward(java.lang.Object, org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector) @bci=63, line=791 (Compiled frame)
      org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(java.lang.Object[], org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$AggregationBuffer[]) @bci=97, line=1064 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.GroupByOperator.flush() @bci=143, line=1089 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(boolean) @bci=125, line=1138 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.Operator.close(boolean) @bci=60, line=575 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close() @bci=65, line=348 (Interpreted frame)
      org.apache.hadoop.hive.ql.exec.tez.TezProcessor.close() @bci=11, line=74 (Interpreted frame)
      org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close() @bci=130, line=325 (Interpreted frame)
      org.apache.hadoop.mapred.YarnTezDagChild$4.run() @bci=112, line=529 (Interpreted frame)
      java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Interpreted frame)
      javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=415 (Interpreted frame)
      org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1548 (Interpreted frame)
      org.apache.hadoop.mapred.YarnTezDagChild.main(java.lang.String[]) @bci=1139, line=515 (Interpreted frame)
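
      The ordering requested in the description can be sketched as follows. This is a minimal illustration, not the contents of TEZ-955.1.patch and not the real Tez API: SketchInput, SketchProcessor, and closeTask are hypothetical names invented for this example. It assumes a processor that, like Hive's, still reads input while flushing in close(), and an input that fails fast with an exception once it has been closed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class CloseOrderSketch {

    /** Hypothetical stand-in for a task input; not a real Tez class. */
    static final class SketchInput {
        private final Iterator<String> records;
        private boolean closed = false;

        SketchInput(List<String> records) {
            this.records = records.iterator();
        }

        /** Returns the next record, or null when exhausted; throws if already closed. */
        String read() throws IOException {
            if (closed) {
                // Fail fast instead of blocking forever on a closed input.
                throw new IOException("read() called on a closed input");
            }
            return records.hasNext() ? records.next() : null;
        }

        void close() {
            closed = true;
        }
    }

    /** Hypothetical processor that still reads input while flushing in close(). */
    static final class SketchProcessor {
        private final SketchInput input;
        final List<String> flushed = new ArrayList<>();

        SketchProcessor(SketchInput input) {
            this.input = input;
        }

        void close() throws IOException {
            String record;
            while ((record = input.read()) != null) {
                flushed.add(record);
            }
        }
    }

    /** Close the processor first, and only then close its inputs. */
    static void closeTask(SketchProcessor processor, List<SketchInput> inputs)
            throws IOException {
        processor.close();
        for (SketchInput input : inputs) {
            input.close();
        }
    }

    public static void main(String[] args) throws IOException {
        SketchInput input = new SketchInput(Arrays.asList("a", "b"));
        SketchProcessor processor = new SketchProcessor(input);
        closeTask(processor, Arrays.asList(input));
        System.out.println(processor.flushed); // prints [a, b]
    }
}
```

      With the two steps in closeTask swapped, the processor's close() in this sketch would hit the IOException right away. In the trace above, the equivalent read instead parks in LinkedBlockingQueue.take() and never returns.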

      Attachments

        1. TEZ-955.1.patch
          1 kB
          Hitesh Shah

            People

              Assignee: Hitesh Shah (hitesh)
              Reporter: Gunther Hagleitner (hagleitn)
              Votes: 0
              Watchers: 3
