Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Done
Description
Currently, most unit and integration tests verify the static behavior of autoscaling, i.e. they introduce a change, wait until the state of the system stabilizes, and then make assertions.
While this testing is necessary, it doesn't address issues related to the framework's transient responses to changes over time. It's also important to learn the timescale of the framework's reactions to such events, and its behavior when faced with overlapping or conflicting events.
One simple example that illustrates the need for this kind of testing is a scenario with a flaky node that periodically goes down and comes back up. Depending on the frequency of changes and the trigger configuration (e.g. the waitFor value), the current implementation of the framework may lead at best to high churn (as replicas are moved around) or at worst to data loss (by moving replicas to a flaky node).
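To make the flaky-node scenario concrete, here is a minimal toy model of how the outage length interacts with waitFor. It is only a sketch under stated assumptions, not the framework's actual trigger logic: the function name, its parameters, and the rule "fire once per outage that lasts at least wait_for seconds" are all hypothetical simplifications.

```python
def count_node_lost_events(down_secs, up_secs, wait_for, horizon):
    """Count how often a hypothetical nodeLost trigger would fire.

    Toy model (an assumption, not the real implementation): the node
    starts up, then alternates between `up_secs` seconds up and
    `down_secs` seconds down. The trigger fires once per outage if the
    outage, clipped to the simulated `horizon`, lasts at least
    `wait_for` seconds.
    """
    fired = 0
    t = 0
    while t < horizon:
        t += up_secs                              # node stays up
        outage_start = t
        outage_end = min(t + down_secs, horizon)  # clip to the horizon
        if outage_end - outage_start >= wait_for:
            fired += 1                            # outage outlasted waitFor
        t += down_secs                            # node comes back up
    return fired


# One simulated hour, waitFor = 60 seconds.
# Outages shorter than waitFor never fire the trigger:
print(count_node_lost_events(down_secs=30, up_secs=30, wait_for=60, horizon=3600))  # 0
# Outages longer than waitFor fire on every cycle, i.e. constant churn:
print(count_node_lost_events(down_secs=90, up_secs=30, wait_for=60, horizon=3600))  # 30
```

Even in this simplified model, a small change in the ratio of outage length to waitFor flips the system from no reaction at all to a replica move on every cycle, which is the kind of dynamic behavior static tests do not exercise.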
Attachments
1. Test NodeLost / NodeAdded dynamics | Resolved | Unassigned
2. Simulate a 1 bln docs scaling-up scenario | Closed | Andrzej Bialecki
3. Implement maxOps limit for IndexSizeTrigger | Resolved | Andrzej Bialecki