Description
On 2.x, the ServerManager registers admins in a HashMap, which is not thread-safe. We recently observed an exception, apparently caused by concurrent access to this map, that left a region stuck in transition indefinitely until we intervened manually. The HMaster logs showed the following exception:
2023-10-11 02:20:05.213 [RSProcedureDispatcher-pool-325] ERROR org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher: Unexpected error caught, this may cause the procedure to hang forever
java.lang.ClassCastException: class java.util.HashMap$Node cannot be cast to class java.util.HashMap$TreeNode (java.util.HashMap$Node and java.util.HashMap$TreeNode are in module java.base of loader 'bootstrap')
    at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1900) ~[?:?]
    at java.util.HashMap$TreeNode.treeify(HashMap.java:2016) ~[?:?]
    at java.util.HashMap.treeifyBin(HashMap.java:768) ~[?:?]
    at java.util.HashMap.putVal(HashMap.java:640) ~[?:?]
    at java.util.HashMap.put(HashMap.java:608) ~[?:?]
    at org.apache.hadoop.hbase.master.ServerManager.getRsAdmin(ServerManager.java:723)
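The trace shows HashMap.put being called from getRsAdmin on a dispatcher pool thread. One common way to make this kind of lazily populated cache safe is to back it with a ConcurrentHashMap and populate it through computeIfAbsent. The following is a minimal sketch of that idea only, not the actual HBase code or patch: the class AdminRegistrySketch, the AdminStub type, the getAdmin method, and the "rs-N" server names are all hypothetical stand-ins.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a per-server admin cache; not the real ServerManager. */
class AdminRegistrySketch {
    /** Hypothetical placeholder for the real admin stub type. */
    static final class AdminStub {
        final String serverName;
        AdminStub(String serverName) { this.serverName = serverName; }
    }

    // ConcurrentHashMap tolerates concurrent readers and writers; a plain HashMap does not.
    private final ConcurrentMap<String, AdminStub> admins = new ConcurrentHashMap<>();
    private final AtomicInteger created = new AtomicInteger();

    AdminStub getAdmin(String serverName) {
        // computeIfAbsent is atomic: the factory runs at most once per key.
        return admins.computeIfAbsent(serverName, name -> {
            created.incrementAndGet();
            return new AdminStub(name);
        });
    }

    int createdCount() { return created.get(); }

    /** Requests the same keys from several threads; returns how many stubs were created. */
    static int demo(int threads, int servers) {
        AdminRegistrySketch registry = new AdminRegistrySketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await(); // release all workers at once to maximize contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < servers; i++) {
                    registry.getAdmin("rs-" + i);
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return registry.createdCount();
    }
}
```

Calling AdminRegistrySketch.demo(8, 1000) has eight threads request the same 1000 keys concurrently. Because computeIfAbsent is atomic per key, exactly one stub is created for each server name and the map's internal structure is never left inconsistent.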