  1. Solr
  2. SOLR-15052

Reducing overseer bottlenecks using per-replica states


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 8.8
    • Component/s: None
    • Labels: None

    Description

      This work has the same goal as SOLR-13951: to reduce overseer bottlenecks by keeping replica state updates from going to state.json via the overseer. However, the approach taken here is different from SOLR-13951, and hence this work supersedes that work.

      The design proposed is here: https://docs.google.com/document/d/1xdxpzUNmTZbk0vTMZqfen9R3ArdHokLITdiISBxCFUg/edit

      Briefly,

      1. Every replica's state will be stored in a separate znode nested under state.json. The znode's name encodes the replica name, state, and leadership status.
      2. An additional children watcher will be set on state.json to observe state changes.
      3. Upon a state change, a ZK multi-op deletes the previous znode and adds a new znode with the new state.
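
The three steps above can be sketched in a few lines of Java. This is a minimal illustration, not the code from the PR: the name layout `replica:version:state[:L]`, the version counter, the class `PerReplicaStateSketch`, and the path `/collections/demo/state.json` are all assumptions made for the example. The sketch does not talk to ZooKeeper; it only prints the pair of operations that a real client would submit together as one multi-op.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: the name layout "replica:version:state[:L]" is an assumption. */
final class PerReplicaStateSketch {

    /** Encodes replica name, a version counter, state, and leadership into a znode name. */
    static String encode(String replica, int version, String state, boolean leader) {
        return replica + ":" + version + ":" + state + (leader ? ":L" : "");
    }

    /** Extracts the state field from an encoded znode name. */
    static String stateOf(String znodeName) {
        return znodeName.split(":")[2];
    }

    /** Returns true when the encoded name carries the leader flag. */
    static boolean isLeader(String znodeName) {
        String[] parts = znodeName.split(":");
        return parts.length == 4 && parts[3].equals("L");
    }

    /**
     * Describes one state change as two operations: delete the old child, create the new one.
     * A real client would submit both in a single ZooKeeper multi-op so they apply atomically.
     */
    static List<String> transition(String statePath, String oldName, String newState) {
        String[] parts = oldName.split(":");
        int nextVersion = Integer.parseInt(parts[1]) + 1;
        String newName = encode(parts[0], nextVersion, newState, isLeader(oldName));
        List<String> ops = new ArrayList<>();
        ops.add("delete " + statePath + "/" + oldName);
        ops.add("create " + statePath + "/" + newName);
        return ops;
    }

    public static void main(String[] args) {
        String path = "/collections/demo/state.json"; // hypothetical collection path
        String old = encode("core_node1", 1, "down", true);
        for (String op : transition(path, old, "active")) {
            System.out.println(op);
        }
    }
}
```

Because each replica owns its own child znode, a state change touches only that child, and a children watcher on state.json is enough for observers to notice it. The sketch assumes replica names contain no `:` character.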

      Differences between this and SOLR-13951:

      1. In SOLR-13951, we planned to leverage shard terms for per-shard states.
      2. As a consequence, the code changes required for SOLR-13951 were massive (we needed a shard state provider abstraction and had to introduce it everywhere in the codebase).
      3. This approach is drastically simpler in both design and code changes.

      Credit for this design and the PR is due to Noble Paul. Mark Miller, Noble Paul and I have collaborated on this effort. The reference branch takes a conceptually similar (but not identical) approach.

      I shall attach a PR and performance benchmarks shortly.


          People

            noble.paul Noble Paul
            ichattopadhyaya Ishan Chattopadhyaya

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 10.5h
