Spark / SPARK-21396

Spark Hive Thriftserver doesn't return UDT field

Details

    Description

      Querying a table that contains an MLlib Vector field through the Spark Hive Thriftserver fails with the exception below.
      Can the Spark Hive Thriftserver be enhanced to return UDT fields?

      ======
      2017-07-13 13:14:25,435 WARN [org.apache.hive.service.cli.thrift.ThriftCLIService] (HiveServer2-Handler-Pool: Thread-18537) Error fetching results:
      java.lang.RuntimeException: scala.MatchError: org.apache.spark.ml.linalg.VectorUDT@3bfc3ba7 (of class org.apache.spark.ml.linalg.VectorUDT)
      at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:83)
      at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
      at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
      at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
      at com.sun.proxy.$Proxy29.fetchResults(Unknown Source)
      at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:454)
      at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:621)
      at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
      at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
      at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
      at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
      at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
      at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      Caused by: scala.MatchError: org.apache.spark.ml.linalg.VectorUDT@3bfc3ba7 (of class org.apache.spark.ml.linalg.VectorUDT)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.addNonNullColumnValue(SparkExecuteStatementOperation.scala:80)
      at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.getNextRowSet(SparkExecuteStatementOperation.scala:144)
      at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:220)
      at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:685)
      at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:498)
      at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
      ... 18 more
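
      The "Caused by" frame shows a scala.MatchError raised in addNonNullColumnValue for a VectorUDT value, which suggests a pattern match over column data types with no case for user-defined types. Below is a minimal, Spark-free sketch of that failure mode and one possible way to handle it. ColType, IntCol, StringCol, UdtCol, renderStrict and renderLenient are hypothetical stand-ins invented for illustration; they are not Spark's classes or the actual Thriftserver code.

```scala
// Hypothetical stand-ins for SQL column types; NOT Spark's real classes.
sealed trait ColType
case object IntCol extends ColType
case object StringCol extends ColType
// A user-defined type that wraps an underlying storage type.
final case class UdtCol(name: String, sqlType: ColType) extends ColType

object MatchErrorSketch {
  // Models the reported failure: there is no case for UdtCol,
  // so passing one throws scala.MatchError at runtime.
  def renderStrict(t: ColType, v: Any): String = (t: @unchecked) match {
    case IntCol    => v.asInstanceOf[Int].toString
    case StringCol => v.asInstanceOf[String]
  }

  // One possible approach (an assumption, not the project's fix):
  // fall back to a string rendering for user-defined types.
  def renderLenient(t: ColType, v: Any): String = t match {
    case IntCol       => v.asInstanceOf[Int].toString
    case StringCol    => v.asInstanceOf[String]
    case UdtCol(_, _) => String.valueOf(v)
  }

  def main(args: Array[String]): Unit = {
    val vec = UdtCol("VectorUDT", StringCol)
    val threw =
      try { renderStrict(vec, "[1.0,2.0]"); false }
      catch { case _: MatchError => true }
    println(s"strict match threw MatchError: $threw")
    println(renderLenient(vec, "[1.0,2.0]"))
  }
}
```

      In this sketch the lenient variant returns the value's string form instead of failing. Whether a string rendering is the right wire format for UDT columns in the Thriftserver is a design question for the maintainers.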

          People

            kentore82 Ken Tore Tallakstad
            Haopu Wang Haopu Wang