  1. Mesos
  2. MESOS-9460

Speculative operations may make master and allocator resource views out of sync.

Details

    • Type: Bug
    • Status: Accepted
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.5.1, 1.6.1, 1.7.0
    • Fix Version/s: None
    • Component/s: agent, master
    • Sprint: Mesos Foundations R8 Sprint 35, Mesos Foundations R9 Sprint 36, Mesos Foundations R9 Sprint 37, Mesos Foundations R10 Sp 38, Mesos Foundations RI11 Sp 40, Mesos Foundations RI11 Sp 41
    • Story Points: 5

    Description

      When speculative operations (RESERVE, UNRESERVE, CREATE, DESTROY) are issued via the master operator API, the master updates the allocator state in Master::apply() and only later updates its own internal state in Master::_apply(). Because these are two separate continuations, other updates to the allocator may be interleaved between them, leaving the master state out of sync with the allocator state.
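
The window between the two continuations can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyViews`, the FIFO mailbox, and `interleavingObservesMismatch` are hypothetical names invented here, the "reserved" counters stand in for real resource objects, and none of this is actual Mesos or libprocess code.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Toy stand-ins for the two resource views; these are not Mesos types.
struct ToyViews {
  int allocatorReserved = 0;  // what the allocator believes is reserved
  int masterReserved = 0;     // what the master believes is reserved
};

// Drains a FIFO mailbox, loosely imitating how an actor handles one event
// at a time. Returns true if some event saw the two views disagree.
bool interleavingObservesMismatch() {
  ToyViews views;
  std::deque<std::function<void()>> mailbox;
  bool sawMismatch = false;

  // First continuation (stands in for Master::apply()): the allocator view
  // is updated immediately, and the master update is queued for later.
  mailbox.push_back([&] {
    views.allocatorReserved += 2;
    // Second continuation (stands in for Master::_apply()).
    mailbox.push_back([&] { views.masterReserved += 2; });
  });

  // An unrelated event that was already queued; it runs between the two
  // continuations and compares the views.
  mailbox.push_back([&] {
    if (views.allocatorReserved != views.masterReserved) sawMismatch = true;
  });

  while (!mailbox.empty()) {
    std::function<void()> event = std::move(mailbox.front());
    mailbox.pop_front();
    event();
  }

  // Once everything has drained, the toy views agree again; the problem is
  // only what the interleaved event saw (and might have acted on).
  assert(views.allocatorReserved == views.masterReserved);
  return sawMismatch;
}
```

In this model the views converge once the queue is empty, but the interleaved event has already observed them disagreeing, which is the kind of window the description refers to.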

      This bug could happen with the following sequence of events:

      This caused MESOS-7971 and likely MESOS-9458 as well.

      It's unclear how this can be fixed reliably. Performing the updates to the allocator state and the master state in a single synchronous block of code might work, but this is difficult for operator-initiated operations. It may also be possible to maintain consistency by ensuring that, whenever the master performs such updates, the allocator is updated before the master state.
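
As a rough illustration of the "single synchronous block" idea only, the toy model below applies both updates inside one queued event, so no other event can run between them. The names `ResourceViews`, `applyInOneBlock`, and `countObservedMismatches` are hypothetical. The sketch also ignores the difficulty noted above for operator-initiated operations, so it should not be read as a proposed fix.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Toy stand-ins for the two resource views; these are not Mesos types.
struct ResourceViews {
  int allocatorReserved = 0;
  int masterReserved = 0;
};

// Hypothetical helper: updates both views within a single event, so the
// mailbox cannot schedule anything between the two writes.
void applyInOneBlock(ResourceViews& views, int delta) {
  views.allocatorReserved += delta;
  views.masterReserved += delta;
}

// Queues a reserve, a check, an unreserve, and another check, then drains
// the queue in FIFO order. Returns how many checks saw the views disagree.
int countObservedMismatches() {
  ResourceViews views;
  std::deque<std::function<void()>> mailbox;
  int mismatches = 0;

  auto check = [&] {
    if (views.allocatorReserved != views.masterReserved) ++mismatches;
  };

  mailbox.push_back([&] { applyInOneBlock(views, 2); });   // toy RESERVE
  mailbox.push_back(check);
  mailbox.push_back([&] { applyInOneBlock(views, -2); });  // toy UNRESERVE
  mailbox.push_back(check);

  while (!mailbox.empty()) {
    std::function<void()> event = std::move(mailbox.front());
    mailbox.pop_front();
    event();
  }
  return mismatches;
}
```

Under these assumptions every check runs either before or after a complete update, so `countObservedMismatches()` reports no disagreement.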

      This ticket will be Done when a comprehensive solution for this issue is designed. A subsequent ticket for actual implementation of that solution should be filed.

            People

              greggomann Greg Mann
              mzhu Meng Zhu
              Gastón Kleiman Gastón Kleiman
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated: