Mesos / MESOS-10158

Mesos Agent gets stuck in Draining due to pending unacknowledged status updates

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • None
    • None
    • master
    • None

    Description

      A Mesos agent can get stuck in the DRAINING state because of pending unacknowledged status updates. When a framework becomes disconnected, the agent keeps sending task status updates for that framework's terminated tasks, and those updates are never acknowledged. The master transitions the agent from DRAINING to DRAINED only after all task status updates have been acknowledged, so the agent stays in DRAINING.

      This problem can be resolved by sending a "Teardown" operation for each lost framework. However, it would be much better if the master handled this situation automatically. At the very least, we should make it easier for an operator to find out what is preventing the draining operation from completing.
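
      The manual workaround above could be scripted roughly as follows. This is a minimal sketch, not a tested procedure: it assumes the master's v1 operator API is reachable at `/api/v1`, that "lost" frameworks are the ones a `GET_FRAMEWORKS` response reports with `connected` explicitly set to false, and that no authentication is required. The master URL and all helper names are hypothetical. Note that a teardown shuts down the framework together with all of its tasks, so the candidate list should be reviewed before anything is sent.

```python
import json
import urllib.request


def disconnected_framework_ids(get_frameworks_response):
    """Return IDs of frameworks explicitly reported as not connected.

    Assumption: the input is the parsed JSON body of a GET_FRAMEWORKS call.
    Frameworks without a "connected" field are skipped on purpose, because
    teardown is destructive and a missing field is not proof of disconnection.
    """
    frameworks = get_frameworks_response.get("get_frameworks", {}).get("frameworks", [])
    return [
        fw["framework_info"]["id"]["value"]
        for fw in frameworks
        if fw.get("connected") is False
    ]


def teardown_call(framework_id):
    """Build the JSON body of a TEARDOWN operator call for one framework."""
    return {
        "type": "TEARDOWN",
        "teardown": {"framework_id": {"value": framework_id}},
    }


def post_call(master_url, call):
    """POST one operator call to the master and return the parsed reply, if any."""
    request = urllib.request.Request(
        master_url + "/api/v1",
        data=json.dumps(call).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = response.read()
    return json.loads(body) if body else None


def teardown_disconnected(master_url, dry_run=True):
    """List disconnected frameworks and, unless dry_run is set, tear them down."""
    frameworks = post_call(master_url, {"type": "GET_FRAMEWORKS"})
    framework_ids = disconnected_framework_ids(frameworks)
    if not dry_run:
        for framework_id in framework_ids:
            post_call(master_url, teardown_call(framework_id))
    return framework_ids
```

      Calling `teardown_disconnected("http://master.example:5050")` with the default `dry_run=True` only returns the candidate framework IDs, which gives an operator a chance to confirm that those frameworks are really gone before passing `dry_run=False`.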

          People

            Assignee: Unassigned
            Reporter: Andrei Budnik (abudnik)
            Votes: 0
            Watchers: 3
