Details
Type: Task
Status: Open
Priority: Major
Resolution: Unresolved
Description
Consider the following sequence of events (in the context of late writes, see OAK-10254):
- a tree structure /a/b/c/d is created properly
- the subtree /a/b is half-removed by a late write, i.e. it is not properly removed
- (at this point this late-write removal would still be detected)
- then a different cluster instance starts up, reusing the clusterId from the above crashed instance
- that cluster instance then does a sweep, thereby implicitly marking those late writes as committed (the commit value now resolves to "c" implicitly, because they are older than sweepRev), so /a/b is now actually deleted. In other words, the late-write revision that was never merged is now suddenly considered merged.
- the side effect of this sweep is that the state of a revision has suddenly changed. Unless the nodesCache is properly invalidated, it now contains the wrong state: e.g. that /a/b/c/d exists even though it no longer does.
- next, that cluster instance runs a classic GC, which deletes the /a/b subtree documents (but fails to properly invalidate that subtree in the caches)
- that cluster instance then tries to create /a/b/c/d/e
- this attempt fails with a ConflictException, since one part of the code (the nodesCache) still expects /a/b/c/d to exist while another part (the documentStore) says it does not; hence "The node 4:/a/b/c/d does not exist or is already deleted at base revision"
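
The sequence above can be illustrated with a small toy model. This is only a sketch under assumptions: StaleCacheSketch, its two collections, gcSubtree and reproduce are hypothetical names invented for illustration and are not Oak APIs, the sweep and GC steps are collapsed into a single subtree deletion, and a plain IllegalStateException stands in for the ConflictException. It only shows how a cache that is not invalidated can disagree with the store about /a/b/c/d.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only: none of these names are Oak APIs. */
class StaleCacheSketch {

    /** Hypothetical stand-in for the documentStore: the paths that currently exist. */
    private final Set<String> documentStore = new HashSet<>();

    /** Hypothetical stand-in for the nodesCache: path -> cached "exists" flag. */
    private final Map<String, Boolean> nodesCache = new HashMap<>();

    /** Creates a node; the parent is checked against both the cache and the store. */
    void create(String path) {
        String parent = path.substring(0, path.lastIndexOf('/'));
        if (!parent.isEmpty()) {
            // The cache is consulted first and only filled from the store on a miss.
            boolean cachedExists = nodesCache.computeIfAbsent(parent, documentStore::contains);
            boolean storedExists = documentStore.contains(parent);
            if (cachedExists && !storedExists) {
                // Cache and store disagree: the situation described in this issue.
                throw new IllegalStateException("The node " + parent
                        + " does not exist or is already deleted at base revision");
            }
            if (!storedExists) {
                // Consistent view: the parent is simply gone.
                throw new IllegalArgumentException("Parent does not exist: " + parent);
            }
        }
        documentStore.add(path);
        nodesCache.put(path, true);
    }

    /** Models sweep + GC: deletes the subtree documents, optionally invalidating the cache. */
    void gcSubtree(String root, boolean invalidateCache) {
        documentStore.removeIf(p -> p.equals(root) || p.startsWith(root + "/"));
        if (invalidateCache) {
            nodesCache.keySet().removeIf(p -> p.equals(root) || p.startsWith(root + "/"));
        }
    }

    /** Replays the scenario and reports how the final create attempt ends. */
    static String reproduce(boolean invalidateCache) {
        StaleCacheSketch s = new StaleCacheSketch();
        s.create("/a");
        s.create("/a/b");
        s.create("/a/b/c");
        s.create("/a/b/c/d");
        s.gcSubtree("/a/b", invalidateCache);
        try {
            s.create("/a/b/c/d/e");
            return "CREATED";
        } catch (IllegalStateException e) {
            return "CONFLICT";
        } catch (IllegalArgumentException e) {
            return "PARENT_MISSING";
        }
    }

    public static void main(String[] args) {
        System.out.println("without invalidation: " + reproduce(false));
        System.out.println("with invalidation:    " + reproduce(true));
    }
}
```

In this model, skipping the invalidation leaves the cache claiming that /a/b/c/d exists while the store has no such document, which produces the conflict. With invalidation, both views agree that the parent is gone and the failure is reported consistently.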