Jackrabbit Oak / OAK-8298

[Direct Binary Access] Blobs that are directly uploaded are not tracked by BlobIdTracker

Details

    Description

      Blobs that are uploaded to the content repository are supposed to be tracked by the BlobIdTracker once the blob is saved.  This is done for blobs uploaded the traditional way in DataStoreBlobStore.writeBlob() [0].

      Blobs uploaded directly do not go through this method, so their blob IDs are never added to the BlobIdTracker.  This affects data store garbage collection (DSGC), because the MarkSweepGarbageCollector relies on the BlobIdTracker for an accurate accounting of the blob IDs in the blob store when determining which ones to retain and which to delete.

      This should be a pretty easy fix.  All direct uploads pass through DataStoreBlobStore.completeBlobUpload() [1], where we have access to the DataRecord created by the direct upload.  We can get the blob ID from the DataRecord and add it to the BlobIdTracker there.
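
      A minimal sketch of the proposed fix, using simplified stand-in types rather than the real Oak classes. SimpleDataRecord, SimpleBlobIdTracker, SimpleBlobStore and the upload-token handling are all hypothetical names invented for illustration; only the idea (add the record's blob ID to the tracker when the direct upload completes) comes from the description above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of the fix. None of these types are the real Oak API;
// they are hypothetical stand-ins that only mirror the roles described above.
public class DirectUploadTrackingSketch {

    /** Stand-in for a DataRecord: exposes only the blob identifier. */
    static final class SimpleDataRecord {
        private final String blobId;

        SimpleDataRecord(String blobId) {
            this.blobId = blobId;
        }

        String getBlobId() {
            return blobId;
        }
    }

    /** Stand-in for the BlobIdTracker: remembers every blob ID it is given. */
    static final class SimpleBlobIdTracker {
        private final Set<String> ids = ConcurrentHashMap.newKeySet();

        void add(String blobId) {
            ids.add(blobId);
        }

        boolean isTracked(String blobId) {
            return ids.contains(blobId);
        }
    }

    /** Stand-in for the blob store with both upload paths. */
    static final class SimpleBlobStore {
        private final SimpleBlobIdTracker tracker;

        SimpleBlobStore(SimpleBlobIdTracker tracker) {
            this.tracker = tracker;
        }

        /** Traditional path: the blob ID is tracked as part of the write. */
        String writeBlob(String blobId) {
            tracker.add(blobId);
            return blobId;
        }

        /**
         * Direct upload path. The record is fabricated from the token here;
         * in this sketch the token format is an assumption.
         */
        SimpleDataRecord completeBlobUpload(String uploadToken) {
            SimpleDataRecord record = new SimpleDataRecord("direct-" + uploadToken);
            // Proposed fix: track the blob ID taken from the completed record.
            tracker.add(record.getBlobId());
            return record;
        }
    }

    public static void main(String[] args) {
        SimpleBlobIdTracker tracker = new SimpleBlobIdTracker();
        SimpleBlobStore store = new SimpleBlobStore(tracker);

        SimpleDataRecord record = store.completeBlobUpload("token-1");
        System.out.println(record.getBlobId() + " tracked: "
                + tracker.isTracked(record.getBlobId()));
    }
}
```

      With the tracker call in completeBlobUpload(), both upload paths leave the blob ID in the tracker, so a collector that consults the tracker sees directly uploaded blobs as well.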

      [0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/DataStoreBlobStore.java#L241

      [1] https://github.com/apache/jackrabbit-oak/blob/46cde5ee49622c94bc95648edf84cf4c00ae1d58/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/DataStoreBlobStore.java#L723

          People

            mattvryan Matt Ryan