  1. Spark
  2. SPARK-31676

QuantileDiscretizer raises error "parameter splits given invalid value" when the splits array includes both -0.0 and 0.0

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.5, 3.0.0
    • Fix Version/s: 2.4.6, 3.0.0
    • Component/s: ML
    • Labels: None

    Description

      Code to reproduce:

      
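      // Uses the `spark` and `sc` values that spark-shell predefines.
      import scala.util.Random
      val rng = new Random(3)

      // 200 random values in [-1.0, 1.0), plus 20 copies each of 0.0 and -0.0
      val a1 = Array.fill(200)(rng.nextDouble * 2.0 - 1.0) ++ Array.fill(20)(0.0) ++ Array.fill(20)(-0.0)

      import spark.implicits._
      val df1 = sc.parallelize(a1, 2).toDF("id")

      import org.apache.spark.ml.feature.QuantileDiscretizer
      // Exact quantiles (relativeError = 0.0), so both -0.0 and 0.0 end up in the splits
      val qd = new QuantileDiscretizer().setInputCol("id").setOutputCol("out").setNumBuckets(200).setRelativeError(0.0)

      // Throws IllegalArgumentException
      val model = qd.fit(df1)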
      
      

      This raises an error like:

      at org.apache.spark.ml.param.Param.validate(params.scala:76)
      at org.apache.spark.ml.param.ParamPair.<init>(params.scala:634)
      at org.apache.spark.ml.param.Param.$minus$greater(params.scala:85)
      at org.apache.spark.ml.param.Params.set(params.scala:713)
      at org.apache.spark.ml.param.Params.set$(params.scala:712)
      at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:41)
      at org.apache.spark.ml.feature.Bucketizer.setSplits(Bucketizer.scala:77)
      at org.apache.spark.ml.feature.QuantileDiscretizer.fit(QuantileDiscretizer.scala:231)
      ... 49 elided
      java.lang.IllegalArgumentException: quantileDiscretizer_479bb5a3ca99 parameter splits given invalid value [-Infinity,-0.9986765732730827,..., -0.0, 0.0, ..., 0.9907184077958491,Infinity]

      0.0 > -0.0 is false, so a splits array containing both -0.0 and 0.0 is not strictly increasing, which breaks the parameter validation check.
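
The comparison behavior can be checked in plain Scala without Spark. The sketch below is illustrative only: `isStrictlyIncreasing` is a hypothetical helper standing in for a strictly-increasing check on splits, the sample `splits` array is made up, and the normalization at the end is one assumed way to avoid the problem, not necessarily the change made for this issue.

```scala
// Hypothetical helper: true when every adjacent pair satisfies a < b.
def isStrictlyIncreasing(splits: Array[Double]): Boolean =
  splits.sliding(2).forall {
    case Array(a, b) => a < b
    case _           => true // fewer than two elements
  }

// Made-up splits containing both signed zeros, as in the reported error.
val splits = Array(Double.NegativeInfinity, -0.5, -0.0, 0.0, 0.5, Double.PositiveInfinity)

// IEEE 754 comparison treats -0.0 and 0.0 as equal, so neither is greater.
val zeroGreater = 0.0 > -0.0
val zerosEqual  = 0.0 == -0.0

// The adjacent pair (-0.0, 0.0) fails the strict check.
val valid = isStrictlyIncreasing(splits)

// Assumed workaround: adding 0.0 turns -0.0 into 0.0 and leaves other values
// unchanged; distinct then drops the duplicate zero.
val normalized = splits.map(_ + 0.0).distinct
val validAfter = isStrictlyIncreasing(normalized)
```

Pasted into a Scala REPL or spark-shell, `valid` should be false for the original array and `validAfter` true once the two zeros are collapsed into one.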

          People

            Assignee: weichenxu123 Weichen Xu
            Reporter: weichenxu123 Weichen Xu