Project: Spark
Parent: SPARK-22216 Improving PySpark/Pandas interoperability
Issue: SPARK-25640

Clarify/Improve EvalType for grouped aggregate and window aggregate

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: PySpark

    Description

      Currently, grouped aggregate and window aggregate use different EvalTypes; however, both map to the same user-facing type, PandasUDFType.GROUPED_AGG.

      Having a single user-facing type makes sense, because PandasUDFType.GROUPED_AGG can be used in both groupby and window operations.

      However, the mismatch between PandasUDFType and EvalType can be confusing to developers. We should clarify and/or improve this.
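
      To make the mismatch concrete, here is a minimal, self-contained sketch of the situation, assuming the user-facing type in question is the grouped-aggregate one. It is not Spark's code: UserFacingType, InternalEvalType, and resolve_eval_type are hypothetical stand-ins that only model one user-facing type resolving to two internal eval types depending on where the UDF is used.

```python
from enum import Enum


class UserFacingType(Enum):
    # Stand-in for the single user-facing UDF type (illustrative only).
    GROUPED_AGG = "grouped_agg"


class InternalEvalType(Enum):
    # Stand-ins for the two distinct internal eval types (hypothetical names).
    GROUPED_AGG = "grouped_agg"
    WINDOW_AGG = "window_agg"


def resolve_eval_type(udf_type: UserFacingType, over_window: bool) -> InternalEvalType:
    """Pick the internal eval type from the user-facing type plus usage context."""
    if udf_type is not UserFacingType.GROUPED_AGG:
        raise ValueError(f"unsupported UDF type: {udf_type}")
    # The same user-facing type resolves differently in groupby vs. window use.
    return InternalEvalType.WINDOW_AGG if over_window else InternalEvalType.GROUPED_AGG


groupby_eval = resolve_eval_type(UserFacingType.GROUPED_AGG, over_window=False)
window_eval = resolve_eval_type(UserFacingType.GROUPED_AGG, over_window=True)
print(groupby_eval.name, window_eval.name)
```

      In this model the user declares one type, and the second eval type appears only once the usage context is known. That context-dependent step is the part developers may find surprising.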

      See discussion at: https://github.com/apache/spark/pull/22620#discussion_r222452544


          People

            Assignee: Unassigned
            Reporter: Li Jin (icexelloss)
            Votes: 0
            Watchers: 3
