Mesos / MESOS-8341

Agent can become stuck in (re-)registering state during upgrades

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: None
    • Labels: None
    • Story Points: 3

    Description

      Currently, an agent is not erased from the set of (re-)registering agents if:

      • it tries to (re-)register with a malformed version string
      • it tries to (re-)register with a version smaller than the minimum supported version
      • it tries to (re-)register with a domain when the master has no domain configured
      • the operator marks the agent as gone while the (re-)registration is ongoing

      Afterwards, all further (re-)registration attempts with the same agent id will be discarded, because the master still thinks that the original (re-)registration is ongoing.
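
      The failure mode described above can be illustrated with a minimal sketch. This is a toy Python model, not the actual Mesos master code: the class `Master`, the flag `erase_on_rejection`, the method names, and the minimum version `1.0.0` are all hypothetical, and the model is synchronous, whereas real (re-)registration is asynchronous. The "marked gone" case is not modeled. The sketch only shows how an early return that skips the bookkeeping cleanup causes every later attempt with the same agent id to be discarded.

```python
class Master:
    """Toy model of a master's (re-)registration bookkeeping (hypothetical)."""

    # Assumed minimum supported agent version, for illustration only.
    MINIMUM_VERSION = (1, 0, 0)

    def __init__(self, domain=None, erase_on_rejection=True):
        self.domain = domain
        # False reproduces the bug: rejected agents stay in the set.
        self.erase_on_rejection = erase_on_rejection
        self.reregistering = set()
        self.registered = set()

    def _validate(self, version, agent_domain):
        """Return an error string, or None if the request is acceptable."""
        try:
            parsed = tuple(int(part) for part in version.split("."))
        except ValueError:
            return "malformed version"
        if len(parsed) != 3:
            return "malformed version"
        if parsed < self.MINIMUM_VERSION:
            return "version too old"
        if agent_domain is not None and self.domain is None:
            return "master has no domain configured"
        return None

    def reregister(self, agent_id, version, agent_domain=None):
        # Attempts are discarded while an earlier one is believed ongoing.
        if agent_id in self.reregistering:
            return "ignored: (re-)registration already in progress"
        self.reregistering.add(agent_id)

        error = self._validate(version, agent_domain)
        if error is not None:
            if self.erase_on_rejection:
                # The cleanup that is missing on the buggy early-return paths.
                self.reregistering.discard(agent_id)
            return "rejected: " + error

        self.reregistering.discard(agent_id)
        self.registered.add(agent_id)
        return "registered"


# Buggy behavior: the rejected agent is stuck even after it is upgraded.
buggy = Master(erase_on_rejection=False)
buggy.reregister("agent-1", "0.9.0")   # rejected: version too old
buggy.reregister("agent-1", "1.5.0")   # ignored: still marked as in progress

# With the cleanup in place, the upgraded agent can (re-)register.
fixed = Master(erase_on_rejection=True)
fixed.reregister("agent-1", "0.9.0")   # rejected: version too old
fixed.reregister("agent-1", "1.5.0")   # registered
```

      In this model, recreating the `Master` object empties the set, which mirrors the observation that a master restart clears the stuck state.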

      Since the most realistic way to encounter this issue is during a cluster upgrade, and since it is fixed by a master restart, it is unlikely to be reported externally.

      Review: https://reviews.apache.org/r/64506

          People

            Assignee: Benno Evers (bennoe)
            Reporter: Benno Evers (bennoe)