Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Invalid
Affects Version/s: 3.2.1
None
None
Description
Describe the bug
Storing the result of a division by 0 as a FLOAT/DOUBLE value (e.g. (1.0/0).floatValue()) via spark-shell yields Infinity. However, the same division (cast(1.0/0 as float)) evaluates to NULL when the value is inserted into a FLOAT/DOUBLE column of a table via spark-sql.
To Reproduce
On Spark 3.2.1 (commit 4f25b3f712), using spark-sql:
$SPARK_HOME/bin/spark-sql
Execute the following:
spark-sql> create table float_vals(c1 float) stored as ORC;
spark-sql> insert into float_vals select cast ( 1.0/0 as float);
spark-sql> select * from float_vals;
NULL
Using spark-shell:
$SPARK_HOME/bin/spark-shell
Execute the following:
scala> import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.{Row, SparkSession}

scala> import org.apache.spark.sql.types._
import org.apache.spark.sql.types._

scala> val rdd = sc.parallelize(Seq(Row(( 1.0/0 ).floatValue())))
rdd: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = ParallelCollectionRDD[180] at parallelize at <console>:28

scala> val schema = new StructType().add(StructField("c1", FloatType, true) )
schema: org.apache.spark.sql.types.StructType = StructType( StructField(c1,FloatType,true))

scala> val df = spark.createDataFrame(rdd, schema)
df: org.apache.spark.sql.DataFrame = [c1: float]

scala> df.show(false)
+---------+
|c1       |
+---------+
|Infinity |
+---------+
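As a side note on the spark-shell steps above: the expression ( 1.0/0 ).floatValue() is evaluated by Scala on the JVM before the value is placed in the Row, so the Infinity is already present when Spark receives it. The spark-sql statement, by contrast, evaluates 1.0/0 inside the SQL engine. The following minimal sketch, in plain Scala with no Spark dependency, illustrates the JVM side only. The object name DivisionByZeroSketch is hypothetical, and toFloat is used here as a stand-in for the floatValue() call in the report.

```scala
// Plain Scala, no Spark required: shows the value that the Row holds
// before Spark ever sees it. The object name is illustrative only.
object DivisionByZeroSketch {
  // Double / Int: the Int operand is widened to Double, so this is
  // IEEE 754 floating-point division, which does not throw.
  val asDouble: Double = 1.0 / 0

  // Narrowing an infinite Double to Float keeps it infinite.
  val asFloat: Float = asDouble.toFloat

  def main(args: Array[String]): Unit = {
    println(asDouble.isPosInfinity) // true
    println(asFloat.isPosInfinity)  // true
    println(asFloat)                // Infinity
  }
}
```

Under this reading, the two reproductions hand Spark different inputs: an already-computed Float in spark-shell, and an unevaluated SQL expression in spark-sql.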
Expected behavior
We expect the two Spark interfaces (spark-sql and spark-shell) to behave consistently for the same data type, input, and configuration (here, FLOAT/DOUBLE and 1.0/0).