Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Won't Fix
Version: 7.5
Description
While writing SOLR-12057, I introduced a test, CdcrReplicaTypesTest, in which CDCR is started on the source cluster while docs are being indexed at the same time:
    // start CDCR
    CdcrTestsUtil.cdcrStart(cluster1SolrClient);

    // ADD operation on cluster 1
    int batchSize = (TEST_NIGHTLY ? 100 : 10);
    int numDocs_c1 = 0;
    for (int k = 0; k < batchSize; k++) {
      req = new UpdateRequest();
      for (; numDocs_c1 < (k + 1) * 100; numDocs_c1++) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "cluster1_" + numDocs_c1);
        doc.addField("xyz", numDocs_c1);
        req.add(doc);
      }
      req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
      log.info("Adding " + batchSize + " docs with commit=true, numDocs=" + numDocs_c1);
      req.process(cluster1SolrClient);
    }
There is a race condition (a synchronization gap in the code): BOOTSTRAP, the initial CDCR synchronization process, does not copy any index files to the target cluster because the docs are still being written. Once normal replication begins, the first few batches are missed and replication starts from later batches.
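The sequence described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Solr's actual CDCR code: the class `CdcrRaceSketch`, the method `docsOnTarget`, and the idea that forwarding starts reading at the current tail of the source update log are all hypothetical and exist only to show how early batches could fall into the gap between an empty bootstrap and the start of normal replication. The parameters in `main` (20 batches of 100 docs, one batch lost) are chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the reported race. None of these names are Solr APIs.
class CdcrRaceSketch {

    /**
     * Counts the docs that reach the target in this simplified model.
     *
     * @param batches                 total batches indexed on the source
     * @param docsPerBatch            docs in each batch
     * @param batchesBeforeForwarding batches written after the (empty) bootstrap
     *                                but before forwarding picks its start position
     */
    static int docsOnTarget(int batches, int docsPerBatch, int batchesBeforeForwarding) {
        int early = Math.min(batchesBeforeForwarding, batches);
        List<Integer> sourceLog = new ArrayList<>(); // one entry per batch
        int targetDocs = 0;

        // Step 1: bootstrap runs while the source index is still empty,
        // so it copies nothing to the target.

        // Step 2: indexing continues before forwarding has positioned itself.
        for (int b = 0; b < early; b++) {
            sourceLog.add(docsPerBatch);
        }

        // Step 3 (assumption): forwarding starts at the current tail of the log,
        // so everything written in step 2 is skipped.
        int readPosition = sourceLog.size();

        // Step 4: the remaining batches are indexed on the source.
        for (int b = early; b < batches; b++) {
            sourceLog.add(docsPerBatch);
        }

        // Step 5: forwarding ships every batch from readPosition onward.
        for (int i = readPosition; i < sourceLog.size(); i++) {
            targetDocs += sourceLog.get(i);
        }
        return targetDocs;
    }

    public static void main(String[] args) {
        int expected = 20 * 100;
        int actual = docsOnTarget(20, 100, 1);
        System.out.println("expected:<" + expected + "> but was:<" + actual + ">");
    }
}
```

In this model the target only matches the source when `batchesBeforeForwarding` is 0, that is, when no indexing happens between the bootstrap and the point where forwarding begins.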
reproduce with: ant test -Dtestcase=CdcrReplicaTypesTest -Dtests.method=testTlogReplica -Dtests.seed=2D7490F137D61C8 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=lt -Dtests.timezone=America/Jamaica -Dtests.asserts=true -Dtests.file.encoding=UTF-8
[beaster] [17:34:45.013] FAILURE 143s | CdcrReplicaTypesTest.testTlogReplica <<<
[beaster] > Throwable #1: java.lang.AssertionError: cluster 1 docs mismatch expected:<2000> but was:<1900>