Details
-
Improvement
-
Status: Resolved
-
Minor
-
Resolution: Fixed
-
None
-
None
-
None
Description
When a tablet server dies, Accumulo attempts to reassign the tablets it was hosting as quickly as possible to maintain availability. If multiple tablet servers die in quick succession, such as from a rolling restart of the Accumulo cluster or a network partition, this behavior can cause a storm of reassignment and rebalancing, placing significant load on the master.
To avert such load, Accumulo should be capable of maintaining a steady tablet assignment state in the face of transient tablet server loss. Instead of reassigning tablets as quickly as possible, Accumulo should be await the return of a temporarily downed tablet server (for some configurable duration) before assigning its tablets to other tablet servers.
Attachments
Issue Links
- breaks
-
ACCUMULO-4381 Use of JDK8 Process.isAlive() method in SuspendedTabletsIT
- Resolved
- links to