Details
Bug
Status: Closed
Critical
Resolution: Fixed
2.3.4
None
Description
https://issues.apache.org/jira/browse/HIVE-10329 helped Hive move away from the constructor cache issue in Hadoop's ReflectionUtils (https://issues.apache.org/jira/browse/HADOOP-10513).
However, there are corner cases where Hadoop's ReflectionUtils is still in use, and this causes a gradual build-up of memory in HS2.
I have observed this in Hive 2.3, but the code path in master has not changed much.
The easiest way to reproduce this is to add a temporary function that extends GenericUDF. FunctionRegistry::cloneGenericUDF then
calls org.apache.hadoop.util.ReflectionUtils.newInstance, which in turn adds the class to the CONSTRUCTOR_CACHE of ReflectionUtils.
CREATE TEMPORARY FUNCTION dummy AS 'com.hive.test.DummyGenericUDF' USING JAR 'file:///home/test/udf/dummy.jar';
select dummy();

at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:107)
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.cloneGenericUDF(FunctionRegistry.java:1353)
at org.apache.hadoop.hive.ql.exec.FunctionInfo.getGenericUDF(FunctionInfo.java:122)
at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:983)
at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
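The retention pattern described above can be modelled with a minimal, self-contained sketch. It does not use Hadoop or Hive: the class ConstructorCacheSketch, its CONSTRUCTOR_CACHE map, the two instantiation helpers, and the DummyUdfA/DummyUdfB classes are all hypothetical stand-ins, and the real ReflectionUtils differs in detail. The sketch only illustrates that a static map keyed by Class keeps a strong reference to every class it has instantiated (and therefore to that class's loader), whereas plain reflection leaves nothing behind. The uncached helper is one possible way to avoid the retention, not necessarily the fix applied for this issue.

```java
import java.lang.reflect.Constructor;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConstructorCacheSketch {
    // Hypothetical stand-in for a static constructor cache: entries are never
    // evicted, so each cached Class (and its class loader) stays reachable.
    public static final Map<Class<?>, Constructor<?>> CONSTRUCTOR_CACHE =
            new ConcurrentHashMap<>();

    // Models a caching newInstance: looks up the no-arg constructor once
    // and keeps it in the static map for the lifetime of the JVM.
    public static <T> T cachedNewInstance(Class<T> clazz) throws ReflectiveOperationException {
        Constructor<?> ctor = CONSTRUCTOR_CACHE.get(clazz);
        if (ctor == null) {
            ctor = clazz.getDeclaredConstructor();
            ctor.setAccessible(true);
            CONSTRUCTOR_CACHE.put(clazz, ctor);
        }
        return clazz.cast(ctor.newInstance());
    }

    // Assumed alternative: instantiate through plain reflection so that
    // no reference to the class is stored anywhere static.
    public static <T> T uncachedNewInstance(Class<T> clazz) throws ReflectiveOperationException {
        Constructor<T> ctor = clazz.getDeclaredConstructor();
        ctor.setAccessible(true);
        return ctor.newInstance();
    }

    // Stand-ins for UDF classes that would be loaded from per-session jars.
    public static class DummyUdfA { public DummyUdfA() {} }
    public static class DummyUdfB { public DummyUdfB() {} }

    public static void main(String[] args) throws Exception {
        cachedNewInstance(DummyUdfA.class);
        cachedNewInstance(DummyUdfA.class);   // same class: entry is reused
        cachedNewInstance(DummyUdfB.class);   // new class: cache grows
        System.out.println(CONSTRUCTOR_CACHE.size());   // prints 2

        uncachedNewInstance(DummyUdfA.class); // leaves the cache untouched
        System.out.println(CONSTRUCTOR_CACHE.size());   // prints 2
    }
}
```

In a long-running HS2, every session that registers a function from its own jar would contribute a distinct class under this model, so the map only grows unless something clears it.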
Note: The reflection-based invocation of Hadoop's ReflectionUtils::clear was removed in 2.x.