SPARK-42401: Incorrect results or NPE when inserting null value into array using array_insert/array_append


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0, 3.5.0
    • Fix Version/s: 3.4.0
    • Component/s: SQL

    Description

      Example:

      create or replace temp view v1 as
      select * from values
      (array(1, 2, 3, 4), 5, 5),
      (array(1, 2, 3, 4), 5, null)
      as v1(col1,col2,col3);
      
      select array_insert(col1, col2, col3) from v1;
      

      This produces an incorrect result:

      [1,2,3,4,5]
      [1,2,3,4,0] <== should be [1,2,3,4,null]
      

      A more succinct example:

      select array_insert(array(1, 2, 3, 4), 5, cast(null as int));
      

      This also produces an incorrect result:

      [1,2,3,4,0] <== should be [1,2,3,4,null]
      
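      The spurious `0` is consistent with an unsafe-style array writer whose result schema claims the array contains no nulls: the null bit is never set, so the element slot keeps the primitive type's default value. The following is a minimal Python sketch of that failure mode; the `ArrayWriter` class and its names are illustrative models, not Spark's actual `UnsafeArrayWriter` code.

      ```python
      # Illustrative model: values live in fixed primitive slots, and nulls are
      # tracked in a separate bitmap that is only maintained when the schema
      # says the array may contain nulls (containsNull=true).
      class ArrayWriter:
          def __init__(self, length, contains_null):
              self.slots = [0] * length          # primitive slots default to 0
              self.null_bits = [False] * length
              self.contains_null = contains_null

          def write(self, index, value):
              if value is None:
                  if self.contains_null:
                      self.null_bits[index] = True
                  # If the schema claims contains_null=False, the null is
                  # silently dropped and the slot keeps its default (0).
              else:
                  self.slots[index] = value

          def result(self):
              return [None if b else v for v, b in zip(self.slots, self.null_bits)]

      # Schema wrongly says the result cannot contain nulls (the reported bug):
      w = ArrayWriter(5, contains_null=False)
      for i, v in enumerate([1, 2, 3, 4, None]):
          w.write(i, v)
      print(w.result())  # [1, 2, 3, 4, 0] -- matches the incorrect result above

      # With the element's nullability propagated into the schema:
      w2 = ArrayWriter(5, contains_null=True)
      for i, v in enumerate([1, 2, 3, 4, None]):
          w2.write(i, v)
      print(w2.result())  # [1, 2, 3, 4, None]
      ```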

      Another example:

      create or replace temp view v1 as
      select * from values
      (array('1', '2', '3', '4'), 5, '5'),
      (array('1', '2', '3', '4'), 5, null)
      as v1(col1,col2,col3);
      
      select array_insert(col1, col2, col3) from v1;
      

      The above query throws a NullPointerException:

      23/02/10 11:08:05 ERROR SparkSQLDriver: Failed in [select array_insert(col1, col2, col3) from v1]
      java.lang.NullPointerException
      	at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:110)
      	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
      	at org.apache.spark.sql.execution.LocalTableScanExec.$anonfun$unsafeRows$1(LocalTableScanExec.scala:44)
      

      array_append has the same issue:

      spark-sql> select array_append(array(1, 2, 3, 4), cast(null as int));
      [1,2,3,4,0] <== should be [1,2,3,4,null]
      Time taken: 3.679 seconds, Fetched 1 row(s)
      spark-sql> select array_append(array('1', '2', '3', '4'), cast(null as string));
      23/02/10 11:13:36 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
      java.lang.NullPointerException
      	at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:110)
      	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.project_doConsume_0$(Unknown Source)
      	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
      
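      Both symptoms (the `0` for primitive elements and the NullPointerException for string elements) point at the declared nullability of the result array: the functions appear to reuse the input array's containsNull flag without accounting for the inserted or appended element itself being nullable. A hedged Python sketch of the corrected nullability computation; the names here are illustrative and not taken from Spark's Catalyst code.

      ```python
      from dataclasses import dataclass

      @dataclass(frozen=True)
      class ArrayType:
          # Minimal stand-in for an array type: element type name plus
          # whether the array is allowed to contain null elements.
          element_type: str
          contains_null: bool

      def result_type(array_type, element_nullable):
          """Result type of array_insert/array_append: the output may contain
          nulls if either the input array could, or the new element is nullable."""
          return ArrayType(
              array_type.element_type,
              array_type.contains_null or element_nullable,
          )

      # array(1, 2, 3, 4) has containsNull=false, but cast(null as int) is
      # nullable, so the result must advertise containsNull=true.
      t = result_type(ArrayType("int", contains_null=False), element_nullable=True)
      print(t.contains_null)  # True
      ```

      With containsNull propagated this way, the generated writer keeps a null bitmap for the result and the null element survives instead of degrading to a default value or crashing the projection.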


          People

            Assignee: bersprockets Bruce Robbins
            Reporter: bersprockets Bruce Robbins
            Votes: 0
            Watchers: 3
