Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
Description
In Java implementation we encountered a case where initiating an RecordReader lead to application blocking due to concurrent requests.
I analysed thread-dumps for our application and in the very initial timeline of the issue - there were in total 9 threads doing some operation on FastReaderBuilder.
Category-1 : 6 threads - Blocked on object lock as reap is a synchronised operation
java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.avro.util.WeakIdentityHashMap.reap(WeakIdentityHashMap.java:161) - waiting to lock <0x00007f4f4a239ab0> (a org.apache.avro.util.WeakIdentityHashMap) at org.apache.avro.util.WeakIdentityHashMap.get(WeakIdentityHashMap.java:115) at java.util.Map.computeIfAbsent(java.base@11.0.18/Map.java:1001) at org.apache.avro.io.FastReaderBuilder.getRecordReaderFromCache(FastReaderBuilder.java:246)
Category-2: 2 threads - Runnable trying to insert entry into backingStore HashMap
java.lang.Thread.State: RUNNABLE at java.util.HashMap$TreeNode.balanceInsertion(java.base@11.0.18/HashMap.java:2304) at java.util.HashMap$TreeNode.treeify(java.base@11.0.18/HashMap.java:2010) at java.util.HashMap$TreeNode.split(java.base@11.0.18/HashMap.java:2245) at java.util.HashMap.resize(java.base@11.0.18/HashMap.java:710) at java.util.HashMap.putVal(java.base@11.0.18/HashMap.java:659) at java.util.HashMap.put(java.base@11.0.18/HashMap.java:608) at org.apache.avro.util.WeakIdentityHashMap.put(WeakIdentityHashMap.java:122) at java.util.Map.computeIfAbsent(java.base@11.0.18/Map.java:1004) at org.apache.avro.io.FastReaderBuilder.getRecordReaderFromCache(FastReaderBuilder.java:246)
Category-3: 1 thread - Runnable trying to remove an entry from backingStore HashMap
java.lang.Thread.State: RUNNABLE at java.util.HashMap$TreeNode.root(java.base@11.0.18/HashMap.java:1889) at java.util.HashMap$TreeNode.getTreeNode(java.base@11.0.18/HashMap.java:1954) at java.util.HashMap.removeNode(java.base@11.0.18/HashMap.java:820) at java.util.HashMap.remove(java.base@11.0.18/HashMap.java:795) at org.apache.avro.util.WeakIdentityHashMap.reap(WeakIdentityHashMap.java:165) - locked <0x00007f4f4a239ab0> (a org.apache.avro.util.WeakIdentityHashMap) at org.apache.avro.util.WeakIdentityHashMap.get(WeakIdentityHashMap.java:115) at java.util.Map.computeIfAbsent(java.base@11.0.18/Map.java:1001) at org.apache.avro.io.FastReaderBuilder.getRecordReaderFromCache(FastReaderBuilder.java:246)
Over time more and more thread started to enter in the BLOCKED state, but the three RUNNABLE threads remained constant.
I think the problem is backingStore being a non concurrent HashMap (a relevant piece on it: https://stackoverflow.com/questions/35534906/java-hashmap-getobject-infinite-loop)
Found a similar old issue report in GenericDatumReader: https://issues.apache.org/jira/browse/AVRO-3531
Attachments
Issue Links
- links to