Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- main (10.0)
- None
Description
While working on SOLR-14763, I found different behavior with LBHttp2SolrClient between branch_9x and main/10.x.
If the first Endpoint in the list has previously failed, branch_9x skips it on subsequent requests and starts with the second Endpoint. Only if all remaining Endpoints fail does it retry the first Endpoint.
In the same situation, main/10.x always tries the first Endpoint, even though it is in the "Zombie List". Only when the first Endpoint fails again does it move on to the second Endpoint.
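To make the branch_9x ordering concrete, here is a minimal sketch. It is not the SolrJ implementation: the class `EndpointOrderSketch`, the method `tryOrder`, and the use of plain strings for Endpoints are all hypothetical, chosen only to illustrate "live Endpoints first, Zombies as a last resort".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; names and types are assumptions, not SolrJ code.
class EndpointOrderSketch {

  /** Returns the order in which Endpoints are tried: live ones first, Zombies last. */
  static List<String> tryOrder(List<String> endpoints, Set<String> zombies) {
    List<String> live = new ArrayList<>();
    List<String> deferred = new ArrayList<>();
    for (String endpoint : endpoints) {
      if (zombies.contains(endpoint)) {
        deferred.add(endpoint); // known to have failed: retry only if all others fail
      } else {
        live.add(endpoint);
      }
    }
    live.addAll(deferred);
    return live;
  }

  public static void main(String[] args) {
    // ep1 failed earlier, so it is tried after ep2 and ep3.
    System.out.println(tryOrder(List.of("ep1", "ep2", "ep3"), Set.of("ep1")));
    // prints [ep2, ep3, ep1]
  }
}
```

Under the main/10.x behavior described above, the equivalent order would be ep1, ep2, ep3, so every request pays for one attempt against an Endpoint already known to be failing.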
The branch_9x behavior seems more desirable, as it minimizes unnecessary work by avoiding Endpoints that are known to fail. Indeed, main/10.x has an obvious bug in EndpointIterator#fetchNext, where it looks up the map holding the Zombies with the wrong type of key. I believe this difference is a regression in main/10.x.
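This class of bug is easy to miss in Java because `Map#get` and `Map#containsKey` accept any `Object`, so a lookup with the wrong key type compiles and simply never matches. The sketch below illustrates the general pitfall only. The actual key and value types of the Zombie map are not shown in this issue, so the `Endpoint` class, the String-keyed map, and both helper methods are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; the real map's key and value types may differ.
class ZombieLookupSketch {

  /** Stand-in for an Endpoint object; assumed to wrap a base URL. */
  static final class Endpoint {
    final String baseUrl;

    Endpoint(String baseUrl) {
      this.baseUrl = baseUrl;
    }
  }

  /** Buggy lookup: passes the Endpoint object to a map keyed by String. */
  static boolean isZombieWrongKey(Map<String, Long> zombies, Endpoint endpoint) {
    return zombies.containsKey(endpoint); // compiles, but can never match a String key
  }

  /** Corrected lookup: uses the same key type the map was populated with. */
  static boolean isZombieRightKey(Map<String, Long> zombies, Endpoint endpoint) {
    return zombies.containsKey(endpoint.baseUrl);
  }

  public static void main(String[] args) {
    Map<String, Long> zombies = new HashMap<>();
    zombies.put("http://host1:8983/solr", 1L); // assumed: keyed by URL string

    Endpoint first = new Endpoint("http://host1:8983/solr");
    System.out.println(isZombieWrongKey(zombies, first)); // prints false
    System.out.println(isZombieRightKey(zombies, first)); // prints true
  }
}
```

With a lookup like the first one, a failed Endpoint is never recognized as a Zombie, which is consistent with main/10.x always trying the first Endpoint again.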
The differing behavior is captured in the test LBHttp2SolrClientTest#testAsyncWithFailures, which was added after the fact with SOLR-14763. I had to change its assertions when backporting to branch_9x to account for the changed behavior.
Issue Links
- is caused by: SOLR-17066 Deprecate and remove core URLs in HttpSolrClient and friends (Resolved)
- relates to: SOLR-14763 SolrJ Client Async HTTP/2 Requests (Closed)