
Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.1, 2.0.2
    • Fix Version/s: 3.0.0-alpha-1, 2.2.0, 2.0.3, 2.1.2
    • None
    • None

    Description

      In the periodic regionServerReport from the RS to the master, we call master.getAssignmentManager().reportOnlineRegions() to make sure the RS has the same state as the master. If the RS holds a region that the master thinks should be on another RS, the master will kill the RS.

      But the regionServerReport could be lagging (due to the network or some other cause), in which case it does not represent the current state of the RegionServer. Besides, when onlining a region we call reportRegionStateTransition and retry forever until it is successfully reported to the master. We can count on the reportRegionStateTransition calls.

      I have encountered cases where regions were closed on the RS and reportRegionStateTransition reached the master successfully. But later, a lagging regionServerReport told the master that the region was online on the RS (which it was not at that moment; the call may have been generated some time earlier and delayed by the network). The master then thought the region should be on another RS and killed the RS, which should not happen.
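
      The race described above can be replayed with a small toy model. This is only a sketch and not HBase code: ToyMaster, transition, replayStaleReport, and the killOnMismatch flag are hypothetical names invented for illustration, and the "warn instead of kill" branch is one possible mitigation assumed here, not necessarily the change that resolved this issue.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy replay of the stale-report race. All names are hypothetical, not HBase APIs. */
public class StaleReportSketch {

    /** Minimal stand-in for the master's view of region assignments. */
    static class ToyMaster {
        // region name -> server the master believes is hosting it
        private final Map<String, String> regionToServer = new HashMap<>();
        private final boolean killOnMismatch;

        ToyMaster(boolean killOnMismatch) {
            this.killOnMismatch = killOnMismatch;
        }

        /** Stand-in for reportRegionStateTransition: retried until delivered, so treated as authoritative. */
        void transition(String region, String server) {
            regionToServer.put(region, server);
        }

        /**
         * Stand-in for the check done on the periodic report.
         * Returns true if the reporting server would be killed.
         */
        boolean reportOnlineRegions(String server, List<String> onlineRegions) {
            for (String region : onlineRegions) {
                String expected = regionToServer.get(region);
                if (expected != null && !expected.equals(server)) {
                    if (killOnMismatch) {
                        return true; // strict policy: a mismatch is fatal for the reporter
                    }
                    // assumed lenient policy: the report may simply be stale, so only warn
                    System.out.println("WARN: " + server + " reported " + region
                        + " but master expects it on " + expected);
                }
            }
            return false;
        }
    }

    /** Replays the sequence from the description and returns whether rs-A would be killed. */
    public static boolean replayStaleReport(boolean killOnMismatch) {
        ToyMaster master = new ToyMaster(killOnMismatch);
        master.transition("region-1", "rs-A");                    // region opened on rs-A
        List<String> delayedReport = Arrays.asList("region-1");   // rs-A's snapshot, stuck in the network
        master.transition("region-1", "rs-B");                    // region closed on rs-A, reopened on rs-B
        return master.reportOnlineRegions("rs-A", delayedReport); // stale report finally arrives
    }

    public static void main(String[] args) {
        System.out.println("strict policy kills rs-A: " + replayStaleReport(true));
        System.out.println("lenient policy kills rs-A: " + replayStaleReport(false));
    }
}
```

      In this model the strict policy kills rs-A even though rs-A had already closed the region and reported the transition. The lenient policy relies on the transition calls as the source of truth and treats the periodic report as advisory.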


          People

            Assignee: Allan Yang (allan163)
            Reporter: Allan Yang (allan163)
            Votes: 0
            Watchers: 5

