Spark / SPARK-44101 Support pandas 2 / SPARK-43295

Make DataFrameGroupBy.sum support string type columns

Details

    Description

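      From pandas 2.0.0, DataFrameGroupBy.sum also works for string type columns, concatenating the strings within each group. The pandas API on Spark currently drops such columns (C in the example), so its result differs from pandas: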

      >>> psdf
         A    B  C      D
      0  1  3.1  a   True
      1  2  4.1  b  False
      2  1  4.1  b  False
      3  2  3.1  a   True
      >>> psdf.groupby("A").sum().sort_index()
           B  D
      A
      1  7.2  1
      2  7.2  1
      >>> psdf.to_pandas().groupby("A").sum().sort_index()
           B   C  D
      A
      1  7.2  ab  1
      2  7.2  ba  1 
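
      The target behavior can be reproduced with plain pandas, as in the minimal sketch below. It assumes pandas 2.0.0 or later is installed and rebuilds the frame from the values shown in the session above; the variable names pdf and result are illustrative only and do not come from the Spark code base.

```python
import pandas as pd

# Same data as the psdf shown above, built as a plain pandas DataFrame.
pdf = pd.DataFrame(
    {
        "A": [1, 2, 1, 2],
        "B": [3.1, 4.1, 4.1, 3.1],
        "C": ["a", "b", "b", "a"],
        "D": [True, False, False, True],
    }
)

# With pandas >= 2.0.0 (assumed), sum() keeps the string column C and
# concatenates its values within each group, in row order.
result = pdf.groupby("A").sum().sort_index()
print(result)
```

      Once pandas-on-Spark supports this, psdf.groupby("A").sum() would be expected to return the same three columns (B, C, D) rather than only B and D.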

            People

              Assignee: Haejoon Lee (itholic)
              Reporter: Haejoon Lee (itholic)