Details
- Bug
- Status: Closed
- Minor
- Resolution: Fixed
- None
- Patch
Description
Summary
There are a couple of issues in HCatRecordObjectInspectorFactory[1] because it uses a static Java HashMap to cache objects:
- Java HashMap is not thread safe. This can lead to data corruption and race conditions in multithreaded servers when two threads update the ObjectInspector cache at the same time.
- There is no eviction policy, which can result in memory leaks. If users read many different schemas, the Hive server will come under memory pressure as the cache accumulates record and object inspectors that are never released.
This patch proposes replacing the cache with a Guava cache, which provides both eviction and thread safety. Guava cache is already used in Hive ObjectInspectorFactory [2], so this change is consistent with the rest of Hive.
Attached is a patch that fixes this issue.
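To illustrate the two properties the fix relies on (thread-safe updates and bounded size), here is a minimal sketch that uses only the JDK so it can run standalone. It is not the code from the patch and it does not use Guava: it substitutes a synchronized, access-ordered LinkedHashMap with LRU eviction as a stand-in. The class name BoundedInspectorCache, the loader parameter, and the size bound are all hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the inspector cache; NOT the class from the patch.
final class BoundedInspectorCache<K, V> {
    private final Map<K, V> entries;

    BoundedInspectorCache(final int maxEntries) {
        // accessOrder=true keeps the least recently used entry first, so
        // removeEldestEntry drops it as soon as the bound is exceeded.
        this.entries = Collections.synchronizedMap(
            new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    // Returns the cached value, creating it with the loader on a miss.
    V get(K key, Function<K, V> loader) {
        // Locking on the synchronized map makes check-then-put atomic,
        // which a plain static HashMap cannot guarantee.
        synchronized (entries) {
            V value = entries.get(key);
            if (value == null) {
                value = loader.apply(key);
                entries.put(key, value);
            }
            return value;
        }
    }

    int size() {
        return entries.size();
    }

    boolean contains(K key) {
        return entries.containsKey(key);
    }
}
```

With a bound of 2, looking up schemas "a", "b", "a" again, and then "c" leaves "a" and "c" cached and evicts "b", since "b" is the least recently used entry. A Guava cache built with CacheBuilder offers the same guarantees with finer-grained concurrency and more eviction options, which is presumably why the patch prefers it.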
References:
- [1] https://github.com/apache/hive/blob/b58d50cb73a1f79a5d079e0a2c5ac33d2efc33a0/hcatalog/core/src/main/java/org/apache/hive/hcatalog/data/HCatRecordObjectInspectorFactory.java#L44-L47
- [2] https://github.com/apache/hive/blob/b58d50cb73a1f79a5d079e0a2c5ac33d2efc33a0/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java#L68-L87
Review Board Link:
Attachments
Issue Links
- is related to
- SPARK-17398 Failed to query on external JSon Partitioned table
- Resolved
- links to