Apache Tez / TEZ-4172

Let tasks be killed after too many overall attempts

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.0, 0.9.3
    • Component/s: None
    • Labels: None

    Description

      Currently, TaskImpl doesn't consider failing a task when there have been too many overall attempts. In the case of LLAP, the number of preempted task attempts, and with it the overall number of task attempts, can keep growing in a LinkedHashMap.
      In an edge case where an upstream application (Hive LLAP) cannot cope with a problematic query, this can also lead to an OOM in the AM, due to the very high number of TaskAttemptImpl objects.
      It would be beneficial to be able to limit the overall number of task attempts, regardless of whether they failed or were killed.
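
      To illustrate the idea, here is a minimal sketch of an overall attempt limit. It is not the attached patch and does not use Tez classes: AttemptLimitSketch, Task, maxFailedAttempts and maxTotalAttempts are hypothetical names chosen for this example, and treating 0 as "no overall limit" is an assumption. It only shows how counting every attempt, failed or killed, can bound the size of the attempt map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: every type and field here is hypothetical, not a Tez class.
public class AttemptLimitSketch {

    enum AttemptState { RUNNING, FAILED, KILLED }

    static class Task {
        final int maxFailedAttempts; // assumed limit that counts failed attempts only
        final int maxTotalAttempts;  // hypothetical overall limit; 0 disables it (assumption)
        final Map<Integer, AttemptState> attempts = new LinkedHashMap<>();
        int failedAttempts = 0;
        boolean terminated = false;

        Task(int maxFailedAttempts, int maxTotalAttempts) {
            this.maxFailedAttempts = maxFailedAttempts;
            this.maxTotalAttempts = maxTotalAttempts;
        }

        // Registers a new attempt and returns its id.
        int startAttempt() {
            if (terminated) {
                throw new IllegalStateException("task already terminated");
            }
            int id = attempts.size();
            attempts.put(id, AttemptState.RUNNING);
            return id;
        }

        // Records how an attempt ended and checks both limits.
        void attemptEnded(int id, AttemptState endState) {
            attempts.put(id, endState);
            if (endState == AttemptState.FAILED) {
                failedAttempts++;
            }
            boolean tooManyFailures = failedAttempts >= maxFailedAttempts;
            // Killed (for example preempted) attempts count here as well.
            boolean tooManyOverall =
                maxTotalAttempts > 0 && attempts.size() >= maxTotalAttempts;
            if (tooManyFailures || tooManyOverall) {
                terminated = true;
            }
        }
    }

    public static void main(String[] args) {
        Task task = new Task(4, 3);
        for (int i = 0; i < 3; i++) {
            task.attemptEnded(task.startAttempt(), AttemptState.KILLED);
        }
        System.out.println("terminated after 3 killed attempts: " + task.terminated);
    }
}
```

      With the overall limit disabled, a task whose attempts are only ever killed would never reach the failure limit, which is the unbounded growth described above.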

      Attachments

        1. TEZ-4172.02.patch
          15 kB
          László Bodor
        2. TEZ-4172.01.patch
          15 kB
          László Bodor

          People

            Assignee: abstractdog László Bodor
            Reporter: abstractdog László Bodor
