SPARK-34319

Self-join after cogroup applyInPandas fails due to unresolved conflicting attributes

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 3.0.1, 3.1.0, 3.2.0
    • Fix Version/s: 3.0.2, 3.1.1
    • Component/s: SQL
    • Labels: None

    Description

       

      df = spark.createDataFrame([(1, 1)], ("column", "value"))
      row = df.groupby("ColUmn").cogroup(
          df.groupby("COLUMN")
      ).applyInPandas(lambda r, l: r + l, "column long, value long")
      row.join(row).show()
      
      Conflicting attributes: column#163321L,value#163322L
      ;;
      'Join Inner
      :- FlatMapCoGroupsInPandas [ColUmn#163312L], [COLUMN#163312L], <lambda>(column#163312L, value#163313L, column#163312L, value#163313L), [column#163321L, value#163322L]
      :  :- Project [ColUmn#163312L, column#163312L, value#163313L]
      :  :  +- LogicalRDD [column#163312L, value#163313L], false
      :  +- Project [COLUMN#163312L, column#163312L, value#163313L]
      :     +- LogicalRDD [column#163312L, value#163313L], false
      +- FlatMapCoGroupsInPandas [ColUmn#163312L], [COLUMN#163312L], <lambda>(column#163312L, value#163313L, column#163312L, value#163313L), [column#163321L, value#163322L]
         :- Project [ColUmn#163312L, column#163312L, value#163313L]
         :  +- LogicalRDD [column#163312L, value#163313L], false
         +- Project [COLUMN#163312L, column#163312L, value#163313L]
            +- LogicalRDD [column#163312L, value#163313L], false
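
      In the plan above, both sides of the join expose the same output attributes (column#163321L, value#163322L), which is what the analyzer reports as conflicting. The following is a toy model in plain Python, not Spark's actual implementation: the Attr class, the helper names and the starting ID for fresh attributes are all hypothetical and exist only to illustrate how reusing one plan on both sides produces the conflict, and how assigning fresh expression IDs to one side would remove it.

```python
from dataclasses import dataclass, replace
from itertools import count


# Hypothetical stand-in for a plan attribute: a name plus an expression ID.
@dataclass(frozen=True)
class Attr:
    name: str
    expr_id: int


def conflicting(left, right):
    """Return the left-side attributes whose expression IDs also appear on the right."""
    right_ids = {a.expr_id for a in right}
    return [a for a in left if a.expr_id in right_ids]


def with_fresh_ids(attrs, next_id):
    """Copy the attributes, giving each one a new expression ID."""
    return [replace(a, expr_id=next(next_id)) for a in attrs]


# Output attributes of the cogroup node, taken from the plan dump above.
cogroup_output = [Attr("column", 163321), Attr("value", 163322)]

# A self-join reuses the same plan, and so the same attributes, on both sides.
left, right = cogroup_output, cogroup_output
print(conflicting(left, right))  # both attributes are reported

# Assumption: any unused ID works; 163323 is an arbitrary starting point.
fresh_ids = count(163323)
right = with_fresh_ids(right, fresh_ids)
print(conflicting(left, right))  # no shared IDs remain
```

      The names of the attributes are unchanged by the renumbering; only the IDs differ, which is enough for the two join sides to be told apart.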
      

       

       

            People

              Assignee: Ngone51 wuyi
              Reporter: Ngone51 wuyi