Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-20441

NPE in GenericUDF when hive.allow.udf.load.on.demand is set to true

Log workAgile BoardRank to TopRank to BottomBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersCreate sub-taskConvert to sub-taskMoveLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      When hive.allow.udf.load.on.demand is set to true and hiveserver2 has been started, the new created function from other clients or hiveserver2 will be loaded from the metastore at the first time.

      When the udf is used in where clause, we got a NPE like:

      Error executing statement:
      org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: NullPointerException null
              at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAP
      SHOT]
              at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHO
      T]
              at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:542) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNA
      PSHOT]
              at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNA
      PSHOT]
              at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:57) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_77]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_77]
              at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
      Caused by: java.lang.NullPointerException
              at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:236) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1104) ~[hive-exec-2.
      3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.
      3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:229) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:176) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11613) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11568) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11536) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3303) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3283) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9592) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10549) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10427) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:11125) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11138) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10807) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1295) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
              at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:204) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
      

      The code to get udf from metastore is:

      private FunctionInfo getFunctionInfoFromMetastoreNoLock(String functionName, HiveConf conf) {
          try {
            String[] parts = FunctionUtils.getQualifiedFunctionNameParts(functionName);
            Function func = Hive.get(conf).getFunction(parts[0].toLowerCase(), parts[1]);
            if (func == null) {
              return null;
            }
            // Found UDF in metastore - now add it to the function registry.
            FunctionInfo fi = registerPermanentFunction(functionName, func.getClassName(), true,
                FunctionTask.toFunctionResource(func.getResourceUris()));
            if (fi == null) {
              LOG.error(func.getClassName() + " is not a valid UDF class and was not registered");
              return null;
            }
            return fi;
          } catch (Throwable e) {
            LOG.info("Unable to look up " + functionName + " in metastore", e);
          }
          return null;
        }
      

      After getting the function, the function is registered to permanent function list through method 'registerPermanentFunction'.

      public FunctionInfo registerPermanentFunction(String functionName,
            String className, boolean registerToSession, FunctionResource... resources) {
          FunctionInfo function = new FunctionInfo(functionName, className, resources);
          // register to session first for backward compatibility
          if (registerToSession) {
            String qualifiedName = FunctionUtils.qualifyFunctionName(
                functionName, SessionState.get().getCurrentDatabase().toLowerCase());
            if (registerToSessionRegistry(qualifiedName, function) != null) {
              addFunction(functionName, function);
              return function;
            }
          } else {
              addFunction(functionName, function);
          }
          return null;
        }
      

      And the variable registerToSession is true, so the object 'function' will be returned. But the genericUDF field of the returned function is null which cause the error.

      We should return the result of the method registerToSessionRegistry returned.

      Attachments

        1. HIVE-20441.patch
          0.9 kB
          Hui Huang
        2. HIVE-20441.1.patch
          2 kB
          Hui Huang
        3. HIVE-20441.2.patch
          4 kB
          Hui Huang
        4. HIVE-20441.3.patch
          4 kB
          Hui Huang
        5. HIVE-20441.4.patch
          4 kB
          Hui Huang

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            dengzh Zhihua Deng Assign to me
            BIGrey Hui Huang
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - Not Specified
              Not Specified
              Remaining:
              Remaining Estimate - 0h
              0h
              Logged:
              Time Spent - 2h 20m
              2h 20m

              Slack

                Issue deployment