Hadoop YARN / YARN-8464

Async scheduling thread could be interrupted when there are no NodeManagers in cluster


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0, 3.1.1
    • Component/s: capacity scheduler
    • Labels: None

    Description

      Test scenario:
      1. Make the directories listed in either yarn.nodemanager.log-dirs or yarn.nodemanager.local-dirs read-only.
      2. Restart the NMs via Ambari. As expected, none of them show up in the RM UI.
      3. Revert the read-only directories and restart the NMs.
      4. Add a non-existent directory to either yarn.nodemanager.log-dirs or yarn.nodemanager.local-dirs (one existing directory plus one non-existent directory).
      5. Restart the NMs via Ambari. As expected, all NMs show as RUNNING with a Health Report message.
      6. Submit a MapReduce sleep job. The job goes into the ACCEPTED state.
      7. The job stays in the ACCEPTED state forever, even though all NMs are running and have available memory.
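
      The description does not include the scheduler code, so the following is only a hypothetical sketch of how a background scheduling thread could die while no nodes are registered and then never serve nodes that join later. It assumes a loop that picks a random start index over the node list; java.util.Random.nextInt(0) throws IllegalArgumentException, which would end an unguarded thread. The class and method names (AsyncSchedulingSketch, pickStartUnguarded, pickStartGuarded, survives) are invented for illustration and are not taken from the YARN code base or the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class AsyncSchedulingSketch {
    private static final Random RANDOM = new Random();

    // Unguarded pick: Random.nextInt(0) throws IllegalArgumentException,
    // so this fails whenever no nodes are registered.
    public static int pickStartUnguarded(int nodeCount) {
        return RANDOM.nextInt(nodeCount);
    }

    // Guarded pick: returns -1 to mean "nothing to schedule in this pass".
    public static int pickStartGuarded(int nodeCount) {
        if (nodeCount == 0) {
            return -1;
        }
        return RANDOM.nextInt(nodeCount);
    }

    // Runs a few scheduling passes on a worker thread and reports whether
    // the thread finished all passes instead of dying on an exception.
    public static boolean survives(List<String> nodes, boolean guarded, int passes)
            throws InterruptedException {
        final boolean[] completed = {false};
        Thread worker = new Thread(() -> {
            for (int i = 0; i < passes; i++) {
                int count = nodes.size();
                int start = guarded ? pickStartGuarded(count) : pickStartUnguarded(count);
                if (start < 0) {
                    continue; // no nodes yet; try again on the next pass
                }
                // A real scheduler would try to allocate on nodes.get(start) here.
            }
            completed[0] = true;
        });
        // Swallow the uncaught exception so the demo output stays readable.
        worker.setUncaughtExceptionHandler((t, e) -> { });
        worker.start();
        worker.join(); // join() makes the write to completed[0] visible here
        return completed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> noNodes = new ArrayList<>();
        System.out.println("unguarded thread survives: " + survives(noNodes, false, 3));
        System.out.println("guarded thread survives:   " + survives(noNodes, true, 3));
    }
}
```

      In this sketch the unguarded worker exits on its first pass over an empty node list, which would match a job staying in ACCEPTED after the NMs come back. The guarded worker simply skips the pass. Whether the actual fix takes this form is an assumption.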

       

      Credit to charanh, who found this issue.

      Attachments

        1. YARN-8464.001.patch
          2 kB
          Sunil G
        2. YARN-8464.002.patch
          3 kB
          Sunil G

          People

            Assignee: sunilg Sunil G
            Reporter: charanh Charan Hebri
