  1. HBase
  2. HBASE-24961

[HBOSS] HBaseObjectStoreSemantics.close should call super.close to make sure its own instance always gets removed from FileSystem.CACHE

Details

    Description

      This came up when running bulk loads on HBase deployments using HBOSS. The fixes introduced by HBASE-23679 use FileSystem.closeAllForUGI(ugi) to make sure FileSystem instances are cleared for the specific running UGI. The problem is that FileSystem.closeAllForUGI does not remove the instance from FileSystem.CACHE explicitly; it calls FileSystem.close, which in turn removes the instance from FileSystem.CACHE. In this case, though, the FileSystem implementation is HBaseObjectStoreSemantics, whose close does not call super.close, so FileSystem.closeAllForUGI closes it but does not remove it from FileSystem.CACHE. Every subsequent FileSystem.get by the same UGI then retrieves the closed HBaseObjectStoreSemantics instance and ultimately fails as below:

       

      2020-08-26 12:43:57,528 ERROR org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager: Failed to complete bulk load
      java.io.IOException: Exception while testing a lock
              at org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.isLocked(ZKTreeLockManager.java:312)
              at org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.writeLockAbove(ZKTreeLockManager.java:183)
              at org.apache.hadoop.hbase.oss.sync.TreeLockManager.treeReadLock(TreeLockManager.java:282)
              at org.apache.hadoop.hbase.oss.sync.TreeLockManager.lock(TreeLockManager.java:449)
              at org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics.exists(HBaseObjectStoreSemantics.java:498)
              at org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager$1.run(SecureBulkLoadManager.java:281)
              at org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager$1.run(SecureBulkLoadManager.java:266)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:360)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1856)
              at org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager.secureBulkLoadHFiles(SecureBulkLoadManager.java:266)
              at org.apache.hadoop.hbase.regionserver.RSRpcServices.bulkLoadHFile(RSRpcServices.java:2445)
              at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42280)
              at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:418)
              at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
              at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
              at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
      Caused by: java.lang.IllegalStateException: Expected state [STARTED] was [STOPPED] 
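
      The caching behaviour described above can be reproduced without Hadoop. The following is a minimal, self-contained sketch, not the actual HBOSS or Hadoop code: FsCacheSketch, BaseFs, BrokenWrapperFs, FixedWrapperFs, get and closeAllFor are hypothetical names standing in for FileSystem.CACHE, FileSystem, HBaseObjectStoreSemantics (before and after the proposed change), FileSystem.get and FileSystem.closeAllForUGI. It assumes only what the description states: the "close all" call invokes close() on each cached instance, and eviction from the cache happens inside the base class close().

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Toy model of the issue; all names here are hypothetical stand-ins.
class FsCacheSketch {
    // Stand-in for FileSystem.CACHE, keyed by a user name instead of a UGI.
    static final Map<String, BaseFs> CACHE = new ConcurrentHashMap<>();

    // Stand-in for FileSystem: close() is what evicts the instance from the cache.
    static class BaseFs implements AutoCloseable {
        final String key;
        boolean closed;

        BaseFs(String key) { this.key = key; }

        @Override
        public void close() {
            closed = true;
            CACHE.remove(key, this); // evict only if this exact instance is cached
        }
    }

    // Overrides close() without delegating: the closed instance stays cached.
    static class BrokenWrapperFs extends BaseFs {
        BrokenWrapperFs(String key) { super(key); }

        @Override
        public void close() {
            closed = true; // releases its own resources, never calls super.close()
        }
    }

    // Same wrapper, but close() always delegates to super.close().
    static class FixedWrapperFs extends BaseFs {
        FixedWrapperFs(String key) { super(key); }

        @Override
        public void close() {
            try {
                // release wrapper-specific resources here
            } finally {
                super.close(); // guarantees eviction from the cache
            }
        }
    }

    // Stand-in for FileSystem.get: returns the cached instance if present.
    static BaseFs get(String key, Function<String, BaseFs> factory) {
        return CACHE.computeIfAbsent(key, factory);
    }

    // Stand-in for FileSystem.closeAllForUGI: closes the instance, does not evict it.
    static void closeAllFor(String key) {
        BaseFs fs = CACHE.get(key);
        if (fs != null) {
            fs.close();
        }
    }

    public static void main(String[] args) {
        BaseFs broken = get("alice", BrokenWrapperFs::new);
        closeAllFor("alice");
        System.out.println("broken: stale instance returned = "
            + (get("alice", BrokenWrapperFs::new) == broken));

        BaseFs fixed = get("bob", FixedWrapperFs::new);
        closeAllFor("bob");
        System.out.println("fixed: stale instance returned = "
            + (get("bob", FixedWrapperFs::new) == fixed));
    }
}
```

      In this model the broken wrapper hands the same closed instance back on the next get, which matches the "Expected state [STARTED] was [STOPPED]" symptom in the trace above, while the delegating wrapper is evicted and the next get creates a fresh instance.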

            People

              wchevreuil Wellington Chevreuil