  Spark / SPARK-39293

The accumulator of ArrayAggregate should copy the intermediate result if string, struct, array, or map

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.0.3, 3.1.2, 3.2.1, 3.3.0
    • Fix Version/s: 3.0.4, 3.1.3, 3.2.2, 3.3.0
    • Component/s: SQL

    Description

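      The accumulator of ArrayAggregate should copy the intermediate result if its type is string, struct, array, or map. Without the copy, a later iteration can overwrite values accumulated by earlier iterations, as the following example shows.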

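      import org.apache.spark.sql.functions._
      // Needed for toDF outside spark-shell; assumes `spark` is the active SparkSession.
      import spark.implicits._
      
      // UDF that reverses a string.
      val reverse = udf((s: String) => s.reverse)
      
      val df = Seq(Array("abc", "def")).toDF("array")
      // Reverse each element and append it to the accumulated array.
      val aggArray = df.withColumn(
        "agg",
        aggregate(
          col("array"),
          array().cast("array<string>"),
          (acc, s) => concat(acc, array(reverse(s)))))
      
      aggArray.show(truncate = false)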
      

      The expected output is:

      +----------+----------+
      |array     |agg       |
      +----------+----------+
      |[abc, def]|[cba, fed]|
      +----------+----------+
      

      but the actual output repeats the last element:

      +----------+----------+
      |array     |agg       |
      +----------+----------+
      |[abc, def]|[fed, fed]|
      +----------+----------+
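
      The symptom above is what one would expect if the accumulator keeps a reference to a value holder that is reused on each iteration. The following is a minimal sketch of that aliasing pattern in plain Scala. It is a simplified model, not Spark's actual implementation: `ReusedBuffer`, `reverseInto`, `aggregateWithoutCopy`, and `aggregateWithCopy` are hypothetical names introduced only for illustration.

```scala
// Simplified model of the aliasing problem; NOT Spark's actual code.
// All names here are hypothetical and exist only for this illustration.
object AccumulatorCopyDemo {
  // Hypothetical mutable holder that is reused across evaluations.
  final class ReusedBuffer { var value: String = "" }

  private val shared = new ReusedBuffer

  // Writes the reversed string into the shared holder and returns the holder itself.
  def reverseInto(s: String): ReusedBuffer = {
    shared.value = s.reverse
    shared
  }

  // Stores references to the shared holder, so every element aliases the same object.
  def aggregateWithoutCopy(input: Seq[String]): Seq[String] = {
    val refs = input.foldLeft(Vector.empty[ReusedBuffer])((acc, s) => acc :+ reverseInto(s))
    refs.map(_.value)
  }

  // Reads the value out of the holder before storing it in the accumulator.
  def aggregateWithCopy(input: Seq[String]): Seq[String] =
    input.foldLeft(Vector.empty[String])((acc, s) => acc :+ reverseInto(s).value)

  def main(args: Array[String]): Unit = {
    println(aggregateWithoutCopy(Seq("abc", "def"))) // Vector(fed, fed)
    println(aggregateWithCopy(Seq("abc", "def")))    // Vector(cba, fed)
  }
}
```

      In this sketch, taking the value out of the holder at each step plays the role of copying the intermediate result: the accumulator no longer depends on an object that the next iteration overwrites.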
      

          People

            Assignee: Takuya Ueshin (ueshin)
            Reporter: Takuya Ueshin (ueshin)
