Details
- Improvement
- Status: Open
- Normal
- Resolution: Unresolved
- None
- None
Description
Schema mismatches detected through gossip are resolved by calling MigrationManager.maybeScheduleSchemaPull. This method may decide to schedule a MigrationTask, but only after a delay of MIGRATION_DELAY_IN_MS = 60000 (for reasons unclear to me). Meanwhile, as long as the migration task hasn't been executed, gossip keeps reporting schema mismatches, and each report triggers another maybeScheduleSchemaPull call, which schedules a further task with the same delay. Some local testing shows that dozens of tasks for the same endpoint will eventually be executed, causing the same stormy behavior for that endpoint.
My proposal is to simply not schedule a new task for an endpoint while a task for that same endpoint is still pending, i.e. still waiting out MIGRATION_DELAY_IN_MS before execution.
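A minimal sketch of the idea, not the actual MigrationManager code: the class name DedupingSchemaPullScheduler, its constructor parameters, and the pullTask callback are all hypothetical and only illustrate tracking pending endpoints in a concurrent set so that repeated gossip notifications do not pile up delayed tasks.

```java
import java.net.InetAddress;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical illustration only; names and structure are assumptions
 * and do not mirror Cassandra's real MigrationManager.
 */
final class DedupingSchemaPullScheduler {
    // Endpoints that have a delayed task scheduled but not yet started.
    private final Set<InetAddress> pending = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService executor;
    private final long delayMs;
    // Stand-in for the work a MigrationTask would do for one endpoint.
    private final Consumer<InetAddress> pullTask;

    DedupingSchemaPullScheduler(ScheduledExecutorService executor,
                                long delayMs,
                                Consumer<InetAddress> pullTask) {
        this.executor = executor;
        this.delayMs = delayMs;
        this.pullTask = pullTask;
    }

    /**
     * Schedules a delayed pull unless one is already waiting for this endpoint.
     * Returns true if a new task was scheduled, false if the call was skipped.
     */
    boolean maybeScheduleSchemaPull(InetAddress endpoint) {
        // add() is atomic: only the first caller per endpoint gets true.
        if (!pending.add(endpoint)) {
            return false;
        }
        try {
            executor.schedule(() -> {
                // Clear the marker when the task starts, so a mismatch seen
                // after this point can schedule a fresh pull.
                pending.remove(endpoint);
                pullTask.accept(endpoint);
            }, delayMs, TimeUnit.MILLISECONDS);
        } catch (RuntimeException e) {
            // Scheduling failed (e.g. executor shut down): do not leave
            // the endpoint marked as pending forever.
            pending.remove(endpoint);
            throw e;
        }
        return true;
    }
}
```

With this shape, any number of calls for one endpoint made within the delay window result in a single executed task. Whether the marker should be cleared when the task starts (as here) or only once it finishes is a design choice the proposal leaves open.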
Attachments
Issue Links
- is related to
- CASSANDRA-11748 Schema version mismatch may leads to Casandra OOM at bootstrap during a rolling upgrade process
- Open