  1. Mesos
  2. MESOS-5527

Provide work conservation incentives for schedulers.

Details

    • Type: Epic
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: allocation, framework
    • Labels: None
    • Epic Name: Work Conservation

    Description

      As we begin to add support for schedulers to revoke resources to obtain their quota or fair share, we need to consider the case of non-cooperative or malicious schedulers that cause excessive revocation either by accident or intentionally.

      For example, a malicious scheduler could deliberately keep its allocation below its fair share and revoke as many resources as it can, in order to disturb existing work as much as possible.

      We can provide mitigation techniques, or incentives / penalties to schedulers that cause excessive revocation:

      • Disallow revocation when resources are available to the scheduler. The scheduler must either choose from the available resources or wait until allocated resources free up. This means picky schedulers may not obtain the exact resources they want.
      • Penalize schedulers causing excessive revocation in order to incentivize them to play nicely.
      • Use a degree of pessimism to restrict which resources a scheduler can revoke (e.g. only batch tasks that have not been running for a long time). If we augment task information to indicate whether a task is a service or a batch job, we may be able to do better here.
      • etc
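
      As a rough illustration only, the three techniques listed above could be combined into a single check. This is a minimal sketch, not Mesos code: `Task`, `Scheduler`, `try_revoke`, the CPU-only resource model, and the thresholds `max_revocations` and `max_runtime_secs` are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical view of a running task that is a revocation candidate."""
    is_batch: bool        # assumes tasks are labeled as service vs. batch
    runtime_secs: float   # how long the task has been running


@dataclass
class Scheduler:
    """Hypothetical per-scheduler bookkeeping kept by the allocator."""
    recent_revocations: int = 0  # revocations caused in the current window


def try_revoke(scheduler, task, available_cpus, requested_cpus,
               max_revocations=5, max_runtime_secs=600.0):
    """Return True (and record the revocation) only if every check passes."""
    # 1. Disallow revocation while free resources can satisfy the request.
    if available_cpus >= requested_cpus:
        return False
    # 2. Penalize excessive revocation: refuse once the budget is spent.
    if scheduler.recent_revocations >= max_revocations:
        return False
    # 3. Pessimism: only batch tasks that have not run for long are revocable.
    if not task.is_batch or task.runtime_secs > max_runtime_secs:
        return False
    scheduler.recent_revocations += 1
    return True
```

      The fixed revocation budget stands in for the "penalty" idea; a real policy might instead decay the counter over time or lower the scheduler's priority, which this sketch does not attempt.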

      The techniques employed for work conservation in the presence of revocation should be configurable, and users should be able to achieve their own custom work conservation policies by implementing an allocator (or a subcomponent of the existing allocator).
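
      One possible shape for such configurability is a small policy interface that the allocator delegates to, selected by name from configuration. The sketch below is an assumption for discussion, not an existing Mesos interface: `RevocationPolicy`, `AlwaysAllow`, `OnlyWhenFreeInsufficient`, the `POLICIES` registry, and the `Allocator` class are all hypothetical.

```python
from abc import ABC, abstractmethod


class RevocationPolicy(ABC):
    """Hypothetical plug-in point for custom work conservation policies."""

    @abstractmethod
    def allows(self, framework_id: str, free_cpus: float,
               wanted_cpus: float) -> bool:
        """Decide whether the framework may revoke resources right now."""


class AlwaysAllow(RevocationPolicy):
    """No work conservation: every revocation request is granted."""

    def allows(self, framework_id, free_cpus, wanted_cpus):
        return True


class OnlyWhenFreeInsufficient(RevocationPolicy):
    """Grant revocation only if free resources cannot cover the request."""

    def allows(self, framework_id, free_cpus, wanted_cpus):
        return free_cpus < wanted_cpus


# Hypothetical registry mapping a configuration value to a policy class.
POLICIES = {
    "always": AlwaysAllow,
    "only_when_free_insufficient": OnlyWhenFreeInsufficient,
}


class Allocator:
    """Toy allocator that delegates revocation decisions to its policy."""

    def __init__(self, policy_name: str):
        self.policy = POLICIES[policy_name]()

    def handle_revocation_request(self, framework_id, free_cpus, wanted_cpus):
        return self.policy.allows(framework_id, free_cpus, wanted_cpus)
```

      A user wanting a custom policy would subclass `RevocationPolicy` and register it under a new name, leaving the rest of the allocator untouched.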

          People

            Assignee: Unassigned
            Reporter: Benjamin Mahler (bmahler)
            Votes: 1
            Watchers: 4
