SPARK-15418

SparkSQL does not support using a UDAF in a CREATE VIEW clause

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 1.6.1
    • Fix Version/s: None
    • Component/s: SQL

    Description

      I am using AWS EMR + Spark 1.6.1 + Hive 1.0.0

      I have included the following UDAF on the Spark classpath: https://github.com/scribd/hive-udaf-maxrow/blob/master/src/com/scribd/hive/udaf/GenericUDAFMaxRow.java

      I registered it in Spark with sqlContext.sql("CREATE TEMPORARY FUNCTION maxrow AS 'some.cool.package.hive.udf.GenericUDAFMaxRow'")

      However, when I call it in Spark inside the following CREATE VIEW query:

      CREATE VIEW VIEW_1 AS
            SELECT
              a.A,
              a.B,
              maxrow ( a.C,
                       a.D,
                       a.E,
                       a.F,
                       a.G,
                       a.H,
                       a.I
                  ) as m
              FROM
                  table_1 a
              JOIN
                  table_2 b
              ON
                      b.Z = a.D
                  AND b.Y  = a.C
              JOIN dummy_table
              GROUP BY
                  a.A,
                  a.B
      

      it fails with the following error:

      16/05/18 19:49:14 WARN RowResolver: Duplicate column info for a.A was overwritten in RowResolver map: _col0: string by _col0: string
      16/05/18 19:49:14 WARN RowResolver: Duplicate column info for a.B was overwritten in RowResolver map: _col1: bigint by _col1: bigint
      16/05/18 19:49:14 ERROR Driver: FAILED: SemanticException [Error 10002]: Line 16:32 Invalid column reference 'C'
      org.apache.hadoop.hive.ql.parse.SemanticException: Line 16:32 Invalid column reference 'C'
                      at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10643)
                      at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10591)
                      at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3656)
      

      Running the same SELECT query without the CREATE VIEW clause works fine.
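
Since the bare SELECT is reported to work, one possible workaround is to run it through sqlContext.sql and register the resulting DataFrame as a temporary table, so the Hive CREATE VIEW path is never used. The sketch below has not been tested against this EMR setup. The helper name register_select_as_temp_table is hypothetical, and the code assumes sqlContext is the HiveContext on which maxrow was registered. Unlike a view, a temporary table lives only for the current session and is not stored in the metastore.

```python
# The SELECT from the report, with the CREATE VIEW wrapper removed.
SELECT_SQL = """
SELECT
  a.A,
  a.B,
  maxrow(a.C, a.D, a.E, a.F, a.G, a.H, a.I) AS m
FROM table_1 a
JOIN table_2 b
  ON b.Z = a.D AND b.Y = a.C
JOIN dummy_table
GROUP BY a.A, a.B
"""


def register_select_as_temp_table(sql_context, select_sql, table_name):
    """Run select_sql and expose its result under table_name.

    Hypothetical helper: sql_context is assumed to be a Spark 1.6
    SQLContext/HiveContext. The temp table is session-scoped only.
    """
    df = sql_context.sql(select_sql)
    # DataFrame.registerTempTable is the Spark 1.x call for this.
    df.registerTempTable(table_name)
    return df


# Intended use in a Spark 1.6 shell (assumption, not run here):
#   register_select_as_temp_table(sqlContext, SELECT_SQL, "VIEW_1")
#   sqlContext.sql("SELECT * FROM VIEW_1").show()
```

If the view has to persist across sessions, this does not replace CREATE VIEW; it only covers queries issued from the same context.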

          People

            Assignee: Unassigned
            Reporter: Hanbo Wang (hbwang)