SPARK-40629

FLOAT/DOUBLE division by 0 gives Infinity/-Infinity/NaN in DataFrame but NULL in SparkSQL

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: Spark Shell, SQL
    • Labels: None

    Description

      Describe the bug

      Evaluating a FLOAT/DOUBLE division by 0 in spark-shell (e.g. (1.0/0).floatValue()) and storing the result in a DataFrame yields Infinity. However, the same division in spark-sql (cast(1.0/0 as float)) evaluates to NULL when the value is inserted into a FLOAT/DOUBLE column of a table.

      To Reproduce

      On Spark 3.2.1 (commit 4f25b3f712), using spark-sql:

      $SPARK_HOME/bin/spark-sql

      Execute the following:

      spark-sql> create table float_vals(c1 float) stored as ORC;
      spark-sql> insert into float_vals select cast ( 1.0/0  as float);
      spark-sql> select * from float_vals;
      NULL

       

      Using spark-shell:

      $SPARK_HOME/bin/spark-shell

      Execute the following:

      scala> import org.apache.spark.sql.{Row, SparkSession}
      import org.apache.spark.sql.{Row, SparkSession}
      scala> import org.apache.spark.sql.types._
      import org.apache.spark.sql.types._
      scala> val rdd = sc.parallelize(Seq(Row(( 1.0/0 ).floatValue())))
      rdd: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = ParallelCollectionRDD[180] at parallelize at <console>:28
      scala> val schema = new StructType().add(StructField("c1", FloatType, true))
      schema: org.apache.spark.sql.types.StructType = StructType( StructField(c1,FloatType,true))
      scala> val df = spark.createDataFrame(rdd, schema)
      df: org.apache.spark.sql.DataFrame = [c1: float]
      scala> df.show(false)
      +---------+
      |c1       |
      +---------+
      |Infinity |
      +---------+
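
One possible reading of the spark-shell result: the expression ( 1.0/0 ).floatValue() is evaluated by Scala on the JVM before the value is wrapped in a Row, so Spark only ever receives an already computed float. The sketch below uses plain Scala with no Spark dependency to show the three IEEE 754 outcomes named in the issue title. The object name DivisionByZeroSketch is hypothetical, and .toFloat is used here as the idiomatic Scala counterpart of .floatValue().

```scala
object DivisionByZeroSketch {
  // Plain JVM floating-point arithmetic; Spark is not involved here.
  // Double division by zero follows IEEE 754 and does not throw.
  val positive: Float = (1.0 / 0).toFloat   // positive infinity
  val negative: Float = (-1.0 / 0).toFloat  // negative infinity
  val undefined: Float = (0.0 / 0).toFloat  // not a number

  def main(args: Array[String]): Unit = {
    println(positive)
    println(negative)
    println(undefined)
  }
}
```

By contrast, the spark-sql statement hands the division itself to Spark's SQL engine, whose division operator returns NULL for a zero divisor when ANSI mode is off (the default in 3.2.1). The two transcripts therefore exercise different division implementations rather than the same one through two interfaces.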
      

      Expected behavior

      We expect the two Spark interfaces (spark-sql and spark-shell) to behave consistently for the same combination of data type, input, and configuration (here, FLOAT/DOUBLE and 1.0/0).

          People

            Assignee: Unassigned
            Reporter: xsys