Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Problem: Metadata are requested more than once for each blob
Setting a breakpoint in AbstractSharedCachingDataStore.getRecordIfStored() and logging the dataIdentifiers shows that backend.getRecord() is called 3 times for the same dataIdentifier when a replication package is installed by vault. The cause appears to be that during commits, every CommitHook runs its own compareAgainstBaseState; because the implementation avoids fetching the blob when it only needs the metadata, the lookup in the existing blob cache is always a miss.
Proposed solution: Cache backend.getRecord() calls
Manual testing has shown that caching backend.getRecord() calls reduces the time spent in getRecordIfStored() by 12–35% when installing replication packages containing 500 paths.
The PR is at https://github.com/apache/jackrabbit-oak/pull/1155
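To illustrate the proposed solution, here is a minimal sketch of memoizing backend record lookups per identifier. The names (`Backend`, `Record`, `CachingBackend`) are hypothetical and do not reflect Oak's actual interfaces; the point is only that repeated lookups for the same identifier, as produced by the per-CommitHook compareAgainstBaseState runs, hit the cache instead of the remote backend:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for the backend record API; not Oak's real types.
interface Record { String id(); }

interface Backend {
    Record getRecord(String identifier);
}

// Wraps a backend and memoizes getRecord() results per identifier.
class CachingBackend implements Backend {
    private final Backend delegate;
    private final Map<String, Record> cache = new ConcurrentHashMap<>();

    CachingBackend(Backend delegate) { this.delegate = delegate; }

    @Override
    public Record getRecord(String identifier) {
        // computeIfAbsent hits the delegate at most once per key,
        // even under concurrent lookups for the same identifier.
        return cache.computeIfAbsent(identifier, delegate::getRecord);
    }
}

public class Demo {
    public static void main(String[] args) {
        AtomicInteger backendCalls = new AtomicInteger();
        Backend remote = id -> { backendCalls.incrementAndGet(); return () -> id; };
        Backend cached = new CachingBackend(remote);

        // Three lookups for the same blob, as observed during replication.
        cached.getRecord("blob-1");
        cached.getRecord("blob-1");
        cached.getRecord("blob-1");

        System.out.println("backend calls: " + backendCalls.get()); // prints 1 instead of 3
    }
}
```

A production version would additionally need bounded size, expiry, and invalidation when a record is deleted or its metadata changes; an unbounded map is only adequate for this demonstration.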
Attachments
Issue Links
- split from OAK-10116: Performance problem when importing nodes with many binary properties and remote blobstore (Open)