  1. Hadoop Map/Reduce
  2. MAPREDUCE-6957

shuffle hangs after a node manager connection timeout

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-beta1, 2.7.5, 2.8.3
    • Component/s: mrv2
    • Labels: None

    Description

      After a connection failure from the reducer to the node manager, shuffles started to hang with the following message:

      org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 - MergeManager returned status WAIT ...
      

      Two problems lead to the hang.

      Problem 1.
      When a reducer has an issue connecting to the node manager, copyFromHost may call putBackKnownMapOutput on the same task attempt multiple times.

      There are two call sites of putBackKnownMapOutput in copyFromHost since MAPREDUCE-6303:
      1. In the finally block of copyFromHost
      2. In the catch block of openShuffleUrl.

      When openShuffleUrl, called from copyFromHost, fails to connect, it returns null.
      By the time it returns null, its catch block has already called putBackKnownMapOutput for all remaining map outputs.
      The finally block of copyFromHost then calls putBackKnownMapOutput on the same map outputs a second time.
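
      The double put-back described above can be illustrated with a minimal sketch. This is a simplified, hypothetical model and not the Hadoop source: the class DoublePutBackSketch, the knownMapOutputs list, and the simulated IOException are assumptions made for illustration; only the method names copyFromHost, openShuffleUrl, and putBackKnownMapOutput come from the description.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the fetcher logic; not the Hadoop implementation.
class DoublePutBackSketch {
    // Stands in for the scheduler's list of map outputs known for one host.
    final List<String> knownMapOutputs = new ArrayList<>();

    void putBackKnownMapOutput(String mapId) {
        knownMapOutputs.add(mapId);
    }

    // Call site 2: on a connection failure, the catch block puts back
    // every remaining map output and the method returns null.
    Object openShuffleUrl(List<String> remaining) {
        try {
            throw new IOException("simulated connection timeout");
        } catch (IOException e) {
            for (String mapId : remaining) {
                putBackKnownMapOutput(mapId);
            }
            return null;
        }
    }

    void copyFromHost(List<String> maps) {
        List<String> remaining = new ArrayList<>(maps);
        try {
            Object input = openShuffleUrl(remaining);
            if (input == null) {
                return; // nothing was fetched; remaining is still full
            }
            remaining.clear(); // a successful fetch would drain this list
        } finally {
            // Call site 1: puts back the same map outputs a second time.
            for (String mapId : remaining) {
                putBackKnownMapOutput(mapId);
            }
        }
    }

    public static void main(String[] args) {
        DoublePutBackSketch sketch = new DoublePutBackSketch();
        sketch.copyFromHost(List.of("map_0", "map_1"));
        // prints [map_0, map_1, map_0, map_1]
        System.out.println(sketch.knownMapOutputs);
    }
}
```

      In this model each map output ends up in the known list twice, which is how more than one fetcher can later be handed the same map output.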

      Problem 2. Problem 1 causes a leak in MergeManager.
      The problem occurs when multiple fetchers get the same set of map attempt outputs to fetch.
      Different fetchers reserve memory from MergeManager in Fetcher.copyMapOutput for the same map outputs.
      When the fetches succeed, only the first copy of each map output gets committed through ShuffleSchedulerImpl.copySucceeded -> InMemoryMapOutput.commit, because commit() is gated by !finishedMaps[mapIndex]; the memory reserved by the other fetchers for the same map output is never committed.
      This may lead to a condition where usedMemory > memoryLimit, while commitMemory < mergeThreshold.
      This deadlocks the MergeManager: a merge is never triggered, yet MergeManager cannot reserve additional space for map outputs.
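
      The accounting behind this deadlock can be sketched as follows. This is a simplified, hypothetical model, not the MergeManager implementation: the class MergeAccountingSketch, its methods, and the numbers (limit 100, threshold 66, two map outputs of size 30) are assumptions chosen for illustration; the field names usedMemory, memoryLimit, commitMemory, mergeThreshold, and finishedMaps follow the description.

```java
// Hypothetical, simplified model of the memory accounting; not Hadoop code.
class MergeAccountingSketch {
    final long memoryLimit;
    final long mergeThreshold;
    final boolean[] finishedMaps;
    long usedMemory = 0;
    long commitMemory = 0;
    boolean mergeTriggered = false;

    MergeAccountingSketch(long memoryLimit, long mergeThreshold, int numMaps) {
        this.memoryLimit = memoryLimit;
        this.mergeThreshold = mergeThreshold;
        this.finishedMaps = new boolean[numMaps];
    }

    // Returns false to model the WAIT status once the limit is exceeded.
    boolean reserve(long size) {
        if (usedMemory > memoryLimit) {
            return false;
        }
        usedMemory += size;
        return true;
    }

    // Only the first successful copy of a map output is committed.
    void copySucceeded(int mapIndex, long size) {
        if (!finishedMaps[mapIndex]) {
            finishedMaps[mapIndex] = true;
            commitMemory += size;
            if (commitMemory >= mergeThreshold) {
                mergeTriggered = true;
            }
        }
        // A duplicate copy is neither committed nor released in this model,
        // so its reservation stays counted in usedMemory.
    }

    boolean isStuck() {
        return usedMemory > memoryLimit && commitMemory < mergeThreshold;
    }

    // Two fetchers (A and B) are handed the same two map outputs of size 30.
    static MergeAccountingSketch demo() {
        MergeAccountingSketch m = new MergeAccountingSketch(100, 66, 2);
        for (int mapIndex = 0; mapIndex < 2; mapIndex++) {
            m.reserve(30); // fetcher A
            m.reserve(30); // fetcher B, same map output
        }
        for (int mapIndex = 0; mapIndex < 2; mapIndex++) {
            m.copySucceeded(mapIndex, 30); // fetcher A: committed
            m.copySucceeded(mapIndex, 30); // fetcher B: gated by finishedMaps
        }
        return m;
    }

    public static void main(String[] args) {
        MergeAccountingSketch m = demo();
        // prints usedMemory=120 commitMemory=60 stuck=true
        System.out.println("usedMemory=" + m.usedMemory
            + " commitMemory=" + m.commitMemory + " stuck=" + m.isStuck());
    }
}
```

      In this model usedMemory ends at 120 (above the limit of 100) while commitMemory ends at 60 (below the threshold of 66), so every further reserve call returns the WAIT result and no merge is ever started to free memory.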

      Attachments

        1. MAPREDUCE-6957.001.patch
          3 kB
          Jooseong Kim
        2. MAPREDUCE-6957.002.patch
          9 kB
          Jooseong Kim
        3. MAPREDUCE-6957.003.patch
          9 kB
          Jooseong Kim

          People

            Assignee: Jooseong Kim (jooseong)
            Reporter: Jooseong Kim (jooseong)