Description
Using Spark 2.4.5, write a pre-1582 date to an ORC file and then read it back:
$ export TZ=UTC
$ bin/spark-shell --conf spark.sql.session.timeZone=UTC
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.5-SNAPSHOT
      /_/

Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sql("select cast('1200-01-01' as date) dt").write.mode("overwrite").orc("/tmp/datefile")

scala> spark.read.orc("/tmp/datefile").show
+----------+
|dt        |
+----------+
|1200-01-01|
+----------+

scala> :quit
Using Spark 3.0 (branch-3.0 at commit a934142f24), read the same file:
$ export TZ=UTC
$ bin/spark-shell --conf spark.sql.session.timeZone=UTC
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.0-SNAPSHOT
      /_/

Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161)
Type in expressions to have them evaluated.
Type :help for more information.

scala> spark.read.orc("/tmp/datefile").show
+----------+
|dt        |
+----------+
|1200-01-08|
+----------+

scala>
The date is off by 7 days: Spark 3.0 returns 1200-01-08 for a value that was written as 1200-01-01.
Timestamps, on the other hand, appear to work as expected.
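
The 7-day offset is what one would expect if the writer computed the stored day count with a hybrid Julian/Gregorian calendar and the reader interprets it with the proleptic Gregorian calendar. That is an assumption about the cause, not something confirmed above. The sketch below uses only JDK classes (no Spark or ORC APIs) to show that the two calendars disagree by exactly 7 days for 1200-01-01:

```scala
import java.time.LocalDate
import java.util.{Calendar, GregorianCalendar, TimeZone}
import java.util.concurrent.TimeUnit

// java.util.GregorianCalendar is a hybrid calendar: dates before the
// October 1582 cutover are interpreted in the Julian calendar.
val hybrid = new GregorianCalendar(TimeZone.getTimeZone("UTC"))
hybrid.clear()
hybrid.set(1200, Calendar.JANUARY, 1)

// Days since 1970-01-01 for "1200-01-01" in the hybrid calendar.
val hybridDays = Math.floorDiv(hybrid.getTimeInMillis, TimeUnit.DAYS.toMillis(1))

// java.time.LocalDate uses the proleptic Gregorian (ISO) calendar.
val prolepticDays = LocalDate.of(1200, 1, 1).toEpochDay

println(hybridDays - prolepticDays)       // prints 7
println(LocalDate.ofEpochDay(hybridDays)) // prints 1200-01-08
```

Reinterpreting the hybrid-calendar day count in the proleptic Gregorian calendar yields 1200-01-08, which matches the value shown by the Spark 3.0 session.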