[SPARK-40315] Non-deterministic hashCode() calculations for ArrayBasedMapData on equal objects

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.2
    • Fix Version/s: 3.1.4, 3.3.1, 3.2.3, 3.4.0
    • Component/s: Spark Core
    • Labels: None

    Description

      `ArrayBasedMapData` does not explicitly override `hashCode()`. As a result, two equal `ArrayBasedMapData` objects (objects with equal keys and values) can produce different hash codes.

      The mismatch is non-deterministic and hard to reproduce, because the inherited default `hashCode()` is outside our control: it is typically derived from object identity rather than from the map's contents.

      We should override `hashCode()` so that equal maps always produce the same hash code. We should also add an explicit `equals()` so that the class is consistent with how `Literal` checks equality of `ArrayBasedMapData` values.
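
      The problem and the proposed direction can be illustrated with a minimal sketch in plain Java. The classes `IdentityHashedMap` and `ContentHashedMap` are hypothetical stand-ins that store keys and values in parallel arrays; they are not Spark's `ArrayBasedMapData`, and the hash formula is only one reasonable choice, not necessarily the one used in the actual fix.

```java
import java.util.Arrays;

// Hypothetical stand-in: content-based equals(), but NO hashCode() override.
// hashCode() therefore falls back to Object.hashCode(), which is typically
// derived from object identity and differs between equal instances.
final class IdentityHashedMap {
    final int[] keys;
    final String[] values;

    IdentityHashedMap(int[] keys, String[] values) {
        this.keys = keys;
        this.values = values;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof IdentityHashedMap)) return false;
        IdentityHashedMap that = (IdentityHashedMap) other;
        return Arrays.equals(keys, that.keys) && Arrays.equals(values, that.values);
    }
}

// Hypothetical stand-in: equals() and hashCode() are both derived from the
// same fields, so equal objects are guaranteed to share a hash code.
final class ContentHashedMap {
    final int[] keys;
    final String[] values;

    ContentHashedMap(int[] keys, String[] values) {
        this.keys = keys;
        this.values = values;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof ContentHashedMap)) return false;
        ContentHashedMap that = (ContentHashedMap) other;
        return Arrays.equals(keys, that.keys) && Arrays.equals(values, that.values);
    }

    @Override
    public int hashCode() {
        // Combine the content hashes of both arrays.
        return 31 * Arrays.hashCode(keys) + Arrays.hashCode(values);
    }
}

class HashCodeDemo {
    public static void main(String[] args) {
        IdentityHashedMap a = new IdentityHashedMap(new int[]{1, 2}, new String[]{"a", "b"});
        IdentityHashedMap b = new IdentityHashedMap(new int[]{1, 2}, new String[]{"a", "b"});
        // Equal by content, yet the hash codes may or may not match from run to run.
        System.out.println("identity: equals=" + a.equals(b)
                + ", same hash=" + (a.hashCode() == b.hashCode()));

        ContentHashedMap c = new ContentHashedMap(new int[]{1, 2}, new String[]{"a", "b"});
        ContentHashedMap d = new ContentHashedMap(new int[]{1, 2}, new String[]{"a", "b"});
        // Equal by content, and the hash codes always match.
        System.out.println("content:  equals=" + c.equals(d)
                + ", same hash=" + (c.hashCode() == d.hashCode()));
    }
}
```

      Deriving `equals()` and `hashCode()` from the same fields is what keeps hash-based lookups and comparisons of equal values stable across runs.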

          People

            Assignee: Carmen Kwan (c27kwan)
            Reporter: Carmen Kwan (c27kwan)
            Votes: 0
            Watchers: 3
