  1. Solr
  2. SOLR-17102

VersionBucket not needed

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • SolrCloud
    • None

    Description

      SolrCloud ensures that updates to the same document ID are applied in the correct order, even when replication or log replay delivers them out of order.  To serialize updates to a given document, a lock is held on a hash of the document's ID.  A hash is used because the locks are pre-created for the core, which caps their total number (numVersionBuckets == 65k by default).  The memory cost of these pre-created locks is non-negligible when there are many cores, and hashing introduces the possibility of collisions between unrelated IDs, especially if numVersionBuckets is configured much lower than the default.
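
      For illustration, the hashed bucket strategy described above can be sketched roughly as follows. This is not the Solr code: the class HashedBuckets, its method names, and the power-of-two mask are assumptions made for the example. It only shows why two unrelated IDs can end up contending for the same pre-created lock.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of a pre-created, hash-striped lock table (not Solr's actual class). */
final class HashedBuckets {
    private final ReentrantLock[] buckets;

    HashedBuckets(int numBuckets) {
        // Assumption for this sketch: a power-of-two size so a bit mask selects the slot.
        if (Integer.bitCount(numBuckets) != 1) {
            throw new IllegalArgumentException("numBuckets must be a power of two");
        }
        buckets = new ReentrantLock[numBuckets];
        // Every lock is allocated up front, whether or not it is ever used.
        for (int i = 0; i < numBuckets; i++) {
            buckets[i] = new ReentrantLock();
        }
    }

    /** Maps an ID hash to a slot; distinct hashes may share a slot. */
    int slot(int idHash) {
        return idHash & (buckets.length - 1);
    }

    ReentrantLock lockFor(int idHash) {
        return buckets[slot(idHash)];
    }
}
```

      With only 16 buckets, hashes 1 and 17 map to the same slot, so updates to two different documents would serialize on one lock. A larger table makes this rarer at the cost of more memory per core.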

      Here I propose doing away with the pre-created hashed bucket strategy.  Instead, I propose simply creating a lock for each update being processed, letting it be garbage collected once no update needs it, and using a ConcurrentHashMap to hold the locks that are in flight.  This strategy is already used in org.apache.solr.util.OrderedExecutor.SparseStripedLock, more or less.
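
      A minimal sketch of the per-update lock idea, assuming a reference count is used to decide when an entry can be dropped from the map. The class InFlightLocks and its methods are hypothetical names for this example; it is not the SparseStripedLock implementation and not a proposed patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical sketch: locks exist only while an update for that ID is in flight. */
final class InFlightLocks<K> {
    private static final class Entry {
        final ReentrantLock lock = new ReentrantLock();
        int refs; // touched only inside map.compute(), which is atomic per key
    }

    private final ConcurrentHashMap<K, Entry> inFlight = new ConcurrentHashMap<>();

    /** Runs the action while holding the lock for this exact ID (no hashing, no collisions). */
    <T> T withLock(K id, Supplier<T> action) {
        // Create the entry on first use, or register interest in the existing one.
        Entry e = inFlight.compute(id, (k, cur) -> {
            Entry x = (cur == null) ? new Entry() : cur;
            x.refs++;
            return x;
        });
        e.lock.lock();
        try {
            return action.get();
        } finally {
            e.lock.unlock();
            // Drop the entry once the last interested thread is done; returning null removes it.
            inFlight.compute(id, (k, cur) -> (--cur.refs == 0) ? null : cur);
        }
    }

    /** Number of IDs that currently have an update in flight. */
    int size() {
        return inFlight.size();
    }
}
```

      The map stays empty when the core is idle, so memory scales with concurrent updates rather than with a configured bucket count.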

      Doing this is more tractable now that VersionBucket holds only a lock and no longer a version (see SOLR-17036).

      The biggest challenge is that the code calls for the ability to use a Condition to await/notify, which means the solution can't simply reuse SparseStripedLock above, nor be quite so simple.
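
      One way the Condition requirement might be accommodated, sketched under the assumption that each in-flight entry carries its own Condition and that a waiting thread keeps its reference on the entry for as long as it waits. The class InFlightLocksWithCondition and its API are hypothetical; whether this fits the actual call sites is untested.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Hypothetical sketch: per-ID lock plus a Condition that waiters and notifiers share. */
final class InFlightLocksWithCondition<K> {
    private static final class Entry {
        final ReentrantLock lock = new ReentrantLock();
        final Condition changed = lock.newCondition();
        int refs; // touched only inside map.compute(), which is atomic per key
    }

    private final ConcurrentHashMap<K, Entry> inFlight = new ConcurrentHashMap<>();

    /**
     * Runs the action with the ID's lock held and hands it the ID's Condition.
     * A thread blocked in await() is still inside this method, so its reference
     * keeps the entry (and therefore the Condition) alive until it is signalled.
     */
    <T> T withLock(K id, Function<Condition, T> action) {
        Entry e = inFlight.compute(id, (k, cur) -> {
            Entry x = (cur == null) ? new Entry() : cur;
            x.refs++;
            return x;
        });
        e.lock.lock();
        try {
            return action.apply(e.changed);
        } finally {
            e.lock.unlock();
            // Remove the entry only when no thread is using or waiting on it.
            inFlight.compute(id, (k, cur) -> (--cur.refs == 0) ? null : cur);
        }
    }

    int size() {
        return inFlight.size();
    }
}
```

      A waiter would loop on its own predicate and call await() on the Condition it is given; a later update for the same ID would call signalAll() from inside its own withLock call. Both reach the same Condition because the waiter's reference prevents the entry from being replaced.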

            People

              Assignee: Unassigned
              Reporter: dsmiley David Smiley