[SOLR-11726] Analyze and verify dynamic behavior of the autoscaling framework - ASF JIRA

Details

Type: Task
Status: Resolved
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: None
Component/s: AutoScaling
Labels: None

Description

Currently, most unit and integration tests verify the static behavior of autoscaling, i.e. they introduce a change, wait until the state of the system stabilizes, and then make assertions.

While this testing is necessary, it doesn't address issues related to the framework's transient responses to changes over time. It's also important to learn the timescale of the framework's reactions to such events, and its behavior when faced with overlapping or conflicting events.

One simple example that illustrates the need for this kind of testing is a scenario with a flaky node that periodically goes down and comes back up. Depending on the frequency of changes and the trigger configuration (e.g. the waitFor value), the current implementation of the framework may lead at best to high churn (as replicas are moved around), or at worst to data loss (by moving replicas to a flaky node).
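
To make the flaky-node scenario concrete, here is a toy model in Python. It is not Solr code and does not reflect how the framework actually implements triggers. It only assumes that a trigger fires once per outage, after the node has been continuously down for at least waitFor seconds. The function name and all parameters (down_secs, up_secs, wait_for, total_secs) are hypothetical and exist only for this sketch.

```python
def count_node_lost_events(down_secs, up_secs, wait_for, total_secs):
    """Count how often a hypothetical nodeLost trigger would fire.

    Assumption (not taken from Solr): the node alternates between being
    down for down_secs and up for up_secs, and the trigger fires at most
    once per outage, once the node has been down for wait_for seconds.
    """
    events = 0
    down_since = None  # second at which the current outage started
    fired = False      # whether the trigger already fired for this outage
    period = down_secs + up_secs

    for t in range(total_secs):
        is_down = (t % period) < down_secs
        if is_down:
            if down_since is None:
                # A new outage begins at this second.
                down_since = t
                fired = False
            if not fired and t - down_since >= wait_for:
                # The outage has lasted long enough for the trigger to fire.
                events += 1
                fired = True
        else:
            # The node is back up, so the outage timer is reset.
            down_since = None
    return events


# Outages shorter than wait_for are ignored entirely.
short_outages = count_node_lost_events(10, 50, 30, 600)
# Outages longer than wait_for fire on every cycle, which means churn.
long_outages = count_node_lost_events(40, 20, 30, 600)
```

In this model, a node that is down for 10 seconds out of every 60 never fires a trigger with a 30 second waitFor, while a node that is down for 40 seconds out of every 60 fires one event per cycle (10 events over 600 seconds). Each of those events could cause replicas to be moved, which is the kind of churn described above.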

Sub-Tasks

1.	Test NodeLost / NodeAdded dynamics	Resolved	Unassigned
2.	Simulate a 1 bln docs scaling-up scenario	Closed	Andrzej Bialecki
3.	Implement maxOps limit for IndexSizeTrigger	Resolved	Andrzej Bialecki

People

Assignee: Unassigned

Reporter: Andrzej Bialecki

Votes: 0

Watchers: 1

Dates

Created: 05/Dec/17 20:18

Updated: 14/Aug/21 01:37

Resolved: 14/Aug/21 01:37