Hive / HIVE-16321

Possible deadlock in metastore with Acid enabled

Details

    Description

      TxnStore.MutexAPI is a mechanism that lets different Metastore instances coordinate their operations. It uses a JDBC connection to do so.

      In some cases this may lead to deadlock. TxnHandler uses a connection pool of fixed size. Suppose you have X simultaneous calls to TxnHandler.lock(), where X >= the size of the pool. These calls take all connections from the pool, so when

      handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
      

      is executed in TxnHandler.checkLock(Connection dbConn, long extLockId), the pool is empty and the system is deadlocked: every caller holds a connection while waiting for another one that no caller will release.
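
      The exhaustion pattern can be reproduced outside Hive with a minimal sketch. Everything here is an assumption made for illustration: a java.util.concurrent.Semaphore stands in for the fixed-size connection pool, the class and method names are hypothetical, and the second acquisition uses a timeout so the demo terminates instead of hanging the way a real blocking pool would. It models the case where X equals the pool size.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in, not Hive code: a Semaphore plays the connection pool.
class PoolExhaustionDemo {

    /** Returns how many callers obtained a second "connection" while holding their first. */
    static int run(int poolSize, int callers) {
        if (callers > poolSize) {
            throw new IllegalArgumentException("this sketch models callers <= poolSize");
        }
        Semaphore pool = new Semaphore(poolSize);
        CountDownLatch allHoldFirst = new CountDownLatch(callers);
        CountDownLatch allTried = new CountDownLatch(callers);
        AtomicInteger gotSecond = new AtomicInteger();
        Thread[] threads = new Thread[callers];

        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    pool.acquire();               // connection held by the lock() call
                    allHoldFirst.countDown();
                    allHoldFirst.await();         // wait until every caller holds one
                    // Connection the mutex would need. With no timeout this blocks forever
                    // once the pool is empty, which is the deadlock described above.
                    if (pool.tryAcquire(200, TimeUnit.MILLISECONDS)) {
                        gotSecond.incrementAndGet();
                        pool.release();
                    }
                    allTried.countDown();
                    allTried.await();             // keep the first connection until all have tried
                    pool.release();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return gotSecond.get();
    }

    public static void main(String[] args) {
        System.out.println("pool=3 callers=3, got second connection: " + run(3, 3));
        System.out.println("pool=4 callers=2, got second connection: " + run(4, 2));
    }
}
```

      When the number of callers equals the pool size, no caller can get its second connection; with spare capacity in the pool, every caller can.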

      MutexAPI can't use the same connection as the operation it's protecting. (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).

      We could make MutexAPI use a separate connection pool (sized larger than the 'primary' connection pool).
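
      A rough sketch of why a dedicated pool removes the cycle, under the same assumptions as before: Semaphores stand in for the two pools, and all names are hypothetical rather than taken from the attached patches. A caller holding a primary connection only ever waits on the mutex pool, and a mutex holder needs nothing further, so every caller eventually finishes.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in, not Hive code: two Semaphores play the two pools.
class SeparateMutexPoolDemo {

    /** Returns how many callers completed the "lock then mutex" sequence. */
    static int run(int primarySize, int mutexSize, int callers) {
        Semaphore primaryPool = new Semaphore(primarySize); // connections for operations
        Semaphore mutexPool = new Semaphore(mutexSize);     // connections reserved for the mutex
        AtomicInteger completed = new AtomicInteger();
        Thread[] threads = new Thread[callers];

        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    primaryPool.acquire();            // connection for the operation itself
                    try {
                        mutexPool.acquire();          // mutex connection comes from the other pool
                        try {
                            completed.incrementAndGet(); // critical section would run here
                        } finally {
                            mutexPool.release();
                        }
                    } finally {
                        primaryPool.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }
}
```

      For example, SeparateMutexPoolDemo.run(2, 3, 8) lets all 8 callers complete even though only 2 primary connections exist, because callers queue for a primary connection instead of holding one while waiting on the same pool.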

      Or we could make TxnHandler.lock(LockRequest rqst) return immediately after enqueueing the lock with the expectation that the caller will always follow up with a call to checkLock(CheckLockRequest rqst).

      cc f1sherox

      Attachments

        1. HIVE-16321.01.branch-2.patch
          10 kB
          Eugene Koifman
        2. HIVE-16321.01.patch
          9 kB
          Eugene Koifman
        3. HIVE-16321.01-branch-2.3.patch
          10 kB
          Eugene Koifman
        4. HIVE-16321.02.branch-2.patch
          10 kB
          Eugene Koifman
        5. HIVE-16321.02.patch
          9 kB
          Eugene Koifman
        6. HIVE-16321.03.patch
          9 kB
          Eugene Koifman
        7. HIVE-16321.03-branch-2.patch
          10 kB
          Eugene Koifman
        8. HIVE-16321.05-branch-2.3.patch
          10 kB
          Eugene Koifman

            People

              ekoifman Eugene Koifman