Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Reviewed
Description
On a cluster under very high load, it is common to see a flush or compaction every minute on each RegionServer, and there is a high chance that multiple regions are being split or merged at any given time.
When RegionMover unloads all regions (graceful stop), it writes the list of regions to a local file. When it loads them back (graceful start), it tries to bring every one of those regions back from the other RegionServers. If even a single region cannot be moved back, RegionMover treats load() as failed. This ignores the possibility that some regions have since been split or merged, so not every region written to the local file necessarily still exists. RegionMover should therefore handle unknown regions gracefully while moving, without marking load() as failed.
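The proposed behavior can be illustrated with a minimal, self-contained sketch. It is not RegionMover's actual code and does not use the HBase API: the class `RegionLoadSketch`, the `Mover` interface, and the `liveRegions` set are hypothetical names introduced here for illustration, and regions are represented as plain strings. The idea shown is only that a region missing from the current set of live regions is skipped and recorded, rather than counted as a failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from HBase.
final class RegionLoadSketch {

    /** Outcome of one load pass over the regions saved at unload time. */
    static final class LoadResult {
        final List<String> moved = new ArrayList<>();
        final List<String> skipped = new ArrayList<>(); // no longer exist (e.g. split or merged)
        final List<String> failed = new ArrayList<>();  // exist, but the move did not succeed

        /** The load succeeds as long as no existing region failed to move. */
        boolean succeeded() {
            return failed.isEmpty();
        }
    }

    /** Assumed stand-in for whatever actually moves a region; returns true on success. */
    interface Mover {
        boolean move(String region);
    }

    static LoadResult load(List<String> savedRegions, Set<String> liveRegions, Mover mover) {
        LoadResult result = new LoadResult();
        for (String region : savedRegions) {
            if (!liveRegions.contains(region)) {
                // Unknown region: skip it instead of failing the whole load.
                result.skipped.add(region);
                continue;
            }
            if (mover.move(region)) {
                result.moved.add(region);
            } else {
                result.failed.add(region);
            }
        }
        return result;
    }
}
```

With this split between `skipped` and `failed`, a caller can still log the skipped regions for the operator while reporting failure only when a region that still exists could not be moved.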