Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version: 3.2.0
Description
In Spark 3.2 (Scala 2.13), values with ArrayType are no longer properly supported when converting a Row to JSON; e.g.

import org.apache.spark.sql.SparkSession

case class KeyValue(key: String, value: Array[Byte])

val spark = SparkSession.builder().master("local[1]").appName("test").getOrCreate()
import spark.implicits._

val df = Seq(Array(KeyValue("foo", "bar".getBytes))).toDF()
df.foreach(r => println(r.json))
Expected:
[{foo, bar}]
Encountered:
java.lang.IllegalArgumentException: Failed to convert value ArraySeq([foo,[B@dcdb68f]) (class of class scala.collection.mutable.ArraySeq$ofRef}) with the type of ArrayType(Seq(StructField(key,StringType,false), StructField(value,BinaryType,false)),true) to JSON.
  at org.apache.spark.sql.Row.toJson$1(Row.scala:604)
  at org.apache.spark.sql.Row.jsonValue(Row.scala:613)
  at org.apache.spark.sql.Row.jsonValue$(Row.scala:552)
  at org.apache.spark.sql.catalyst.expressions.GenericRow.jsonValue(rows.scala:166)
  at org.apache.spark.sql.Row.json(Row.scala:535)
  at org.apache.spark.sql.Row.json$(Row.scala:535)
  at org.apache.spark.sql.catalyst.expressions.GenericRow.json(rows.scala:166)
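A plausible root cause (an assumption on my part, not confirmed in this report): in Scala 2.13, `scala.Seq` became an alias for `scala.collection.immutable.Seq`, so a `scala.collection.mutable.ArraySeq` (the class named in the exception) no longer matches a pattern written against `scala.Seq`, even though it still matches `scala.collection.Seq`. A minimal sketch of the distinction, with no Spark dependency:

```scala
// Scala 2.13+: scala.Seq == scala.collection.immutable.Seq.
// mutable.ArraySeq is a collection.Seq but NOT an immutable.Seq,
// so a `case s: Seq[_]` match arm compiled against scala.Seq skips it.
val arr = scala.collection.mutable.ArraySeq("foo")

println(arr.isInstanceOf[scala.collection.immutable.Seq[_]]) // false
println(arr.isInstanceOf[scala.collection.Seq[_]])           // true
```

If this is the mechanism, a match in `Row.toJson` would need to test against `scala.collection.Seq[_]` (or convert with `toSeq`) to handle both Scala 2.12 and 2.13 collections.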