Spark / SPARK-36459

Date Value '0001-01-01' changes to '0001-12-30' when inserted into a parquet hive table

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.4.4
    • Fix Version/s: None
    • Component/s: Spark Core, Spark Shell
    • Labels: None

    Description

      Hi All, 

      We are seeing this issue on Spark 2.4.4. The steps to reproduce it are below.

      Log in to the Hive terminal on the cluster and create the following tables:

      create table t_src(dob timestamp);
      insert into t_src values('0001-01-01 00:00:00.0');
      create table t_tgt(dob timestamp) stored as parquet;

       

      Spark-shell steps:

      import org.apache.spark.sql.hive.HiveContext

      // HiveContext is deprecated since Spark 2.0 but is still available in 2.4.4.
      val sqlContext = new HiveContext(sc)

      val q0 = "TRUNCATE table t_tgt"
      val q1 = "SELECT alias.dob as a0 FROM t_src alias"
      val q2 = "INSERT INTO TABLE t_tgt SELECT tbl0.a0 as c0 FROM tbl0"

      // Empty the target, expose the source rows as temp view tbl0, then insert into the Parquet table.
      sqlContext.sql(q0)
      sqlContext.sql(q1).select("a0").createOrReplaceTempView("tbl0")
      sqlContext.sql(q2)

       

      After this, check the contents of the target table t_tgt. The timestamp "0001-01-01 00:00:00" has changed to "0001-12-30 00:00:00".

       select * from t_tgt;
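
      The shift is consistent with a calendar mismatch, although that is an assumption and nothing in this report confirms it. Spark 2.4 and earlier use the hybrid Julian/Gregorian calendar for dates and timestamps, while other readers may interpret the stored value in the proleptic Gregorian calendar. The plain-JVM sketch below (no Spark needed; the object and method names are hypothetical) shows that 0001-01-01 in the hybrid calendar lies two days before 0001-01-01 in the proleptic calendar, and that re-reading it as proleptic gives 0000-12-30, which prints as 0001-12-30 when only the year-of-era is shown.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.util.{GregorianCalendar, Locale, TimeZone}

object CalendarShiftSketch {
  private val MillisPerDay = 24L * 60 * 60 * 1000

  // Days since 1970-01-01 for a date interpreted in java.util's hybrid
  // Julian/Gregorian calendar (Julian rules apply before the 1582 cutover).
  def hybridEpochDay(year: Int, month: Int, day: Int): Long = {
    val cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"))
    cal.clear()
    cal.set(year, month - 1, day) // Calendar months are zero-based
    Math.floorDiv(cal.getTimeInMillis, MillisPerDay)
  }

  // Formats a proleptic Gregorian date using year-of-era ('y') and no era
  // marker, so 1 BC (proleptic year 0) is rendered as "0001".
  def withoutEra(date: LocalDate): String =
    date.format(DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.ROOT))

  def main(args: Array[String]): Unit = {
    val hybrid    = hybridEpochDay(1, 1, 1)
    val proleptic = LocalDate.of(1, 1, 1).toEpochDay
    val reread    = LocalDate.ofEpochDay(hybrid)

    println(s"hybrid epoch day:     $hybrid")               // -719164
    println(s"proleptic epoch day:  $proleptic")            // -719162
    println(s"re-read as proleptic: $reread")               // 0000-12-30
    println(s"printed without era:  ${withoutEra(reread)}") // 0001-12-30
  }
}
```

      If this is the cause, the stored value is off by two days and falls in 1 BC; it is not a date in December of year 1.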

      Is this a known issue? Is it fixed in any subsequent releases?

      Thanks & regards,

      Sindhura Alluri

          People

            Assignee: Unassigned
            Reporter: Sindhura Alluri (salluri)