  1. Mesos
  2. MESOS-10023

Allocator method dispatches can be reordered (relative to scheduler API calls which triggered them).

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • 1.9.0
    • 1.10.0
    • None
    • Foundations: RI-20 59, Studio 4: RI-21 60, Studio 4: RI-21 61, Studio 4: RI-22 62, Studio 4: RI-23 64
    • 8

    Description

      We observed an example of such reordering on a test cluster running a V1 framework.
      Framework side:

      • the framework issues an ACCEPT call with no operations and a 365+ day filter for an offer from a slave
      • the framework issues a REVIVE call for all roles (which should clear all filters)
      • the framework waits for an offer for that slave and never receives it

      Master side:

      • the master receives ACCEPT, processes the first part and starts authorization
      • the master receives REVIVE and dispatches reviveOffers() to the allocator
      • the master receives a response from the authorizer (for ACCEPT) and dispatches recoverResources() with a 365-day filter to the allocator, so the filter is installed after the revive that was meant to clear it
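
      The master-side sequence above can be reproduced with a small toy model. This is only a sketch: ToyAllocator, ToyMaster and authorizer_responds() are hypothetical names invented for illustration and are not Mesos classes or APIs. The only thing modelled is that ACCEPT defers its allocator dispatch until authorization completes, while REVIVE dispatches immediately.

```python
from collections import deque


class ToyAllocator:
    """Hypothetical stand-in for the allocator: tracks per-agent offer filters."""

    def __init__(self):
        self.filters = {}  # agent id -> filter duration in days
        self.log = []      # order in which dispatches actually arrived

    def revive_offers(self):
        # REVIVE clears every filter that is installed at this moment.
        self.filters.clear()
        self.log.append("reviveOffers")

    def recover_resources(self, agent, filter_days):
        # Declined resources are recovered and a filter is installed.
        self.filters[agent] = filter_days
        self.log.append("recoverResources")


class ToyMaster:
    """Hypothetical master: ACCEPT must wait for an asynchronous authorizer."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.pending_authorizations = deque()

    def accept(self, agent, filter_days):
        # The allocator dispatch is deferred until authorization completes.
        self.pending_authorizations.append(
            lambda: self.allocator.recover_resources(agent, filter_days)
        )

    def revive(self):
        # REVIVE needs no authorization round trip here, so it dispatches at once.
        self.allocator.revive_offers()

    def authorizer_responds(self):
        # The authorizer's reply arrives; run the deferred dispatch.
        self.pending_authorizations.popleft()()


allocator = ToyAllocator()
master = ToyMaster(allocator)

master.accept("agent-1", filter_days=365)  # framework call 1
master.revive()                            # framework call 2
master.authorizer_responds()               # authorization for call 1 finishes late

print(allocator.log)      # ['reviveOffers', 'recoverResources']
print(allocator.filters)  # {'agent-1': 365}
```

      In this model the framework sent ACCEPT before REVIVE, yet the allocator sees them in the opposite order and the 365-day filter survives the revive.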

      We need to give frameworks a way to avoid this kind of reordering.

      Things to consider:

      • V1 frameworks are not required to use a single connection for API requests; even if they were, there is still the reconnection case, during which the framework's and the master's views of the connection state might differ. This means that we cannot completely avoid this problem by sequencing the processing of requests from the same connection.
      • Currently, all calls that directly influence the allocator (except for UPDATE_FRAMEWORK) return '202 Accepted' at an early stage of processing. Unconditionally changing this might break compatibility with some existing frameworks.
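
      The first consideration can be illustrated with another toy model. It is a sketch under stated assumptions: PerConnectionSequencer, the connection names and the step counts are all hypothetical, and Mesos does not sequence requests this way. The model assumes each connection finishes its oldest call before starting the next, and that ACCEPT takes more processing steps than REVIVE because of authorization.

```python
from collections import deque


class PerConnectionSequencer:
    """Hypothetical: each connection's calls are processed strictly in order."""

    def __init__(self):
        self.queues = {}     # connection id -> deque of [call name, steps left]
        self.completed = []  # order in which calls finished processing

    def submit(self, connection, call, steps):
        self.queues.setdefault(connection, deque()).append([call, steps])

    def tick(self):
        # Every connection advances only its oldest unfinished call by one step.
        for queue in self.queues.values():
            if queue:
                queue[0][1] -= 1
                if queue[0][1] == 0:
                    self.completed.append(queue.popleft()[0])

    def run(self):
        while any(self.queues.values()):
            self.tick()
        return self.completed


def completion_order(accept_connection, revive_connection):
    sequencer = PerConnectionSequencer()
    # Assumed costs: ACCEPT needs 3 steps (authorization), REVIVE needs 1.
    sequencer.submit(accept_connection, "ACCEPT", steps=3)
    sequencer.submit(revive_connection, "REVIVE", steps=1)
    return sequencer.run()


same = completion_order("conn-1", "conn-1")
split = completion_order("conn-1", "conn-2")

print(same)   # ['ACCEPT', 'REVIVE']
print(split)  # ['REVIVE', 'ACCEPT']
```

      With both calls on one connection the order is preserved, but as soon as the calls arrive over different connections (or the framework reconnects between them), per-connection sequencing no longer constrains their relative order.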

            People

              Assignee: Andrei Sekretenko (asekretenko)
              Reporter: Andrei Sekretenko (asekretenko)