  1. Hive
  2. HIVE-25292

to_unix_timestamp & unix_timestamp should support ENGLISH format by default


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • None
    • None
    • Clients

    Description

      Hi,

      The to_unix_timestamp function is implemented by GenericUDFToUnixTimeStamp, which uses SimpleDateFormat to parse string-typed time values.

      However, the SimpleDateFormat is created without a Locale argument, so the JVM's default locale is used. As a result, on machines whose default locale is not English, queries like the following return NULL:

       

      hive> select to_unix_timestamp('16/Mar/2017:12:25:01', 'dd/MMM/yyy:HH:mm:ss');
      OK
      NULL
      hive> select unix_timestamp('16/Mar/2017:12:25:01', 'dd/MMM/yyy:HH:mm:ss');
      OK
      NULL
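
The locale dependence can be reproduced outside Hive with plain SimpleDateFormat. The following is a minimal sketch, not Hive's actual code: the class LocaleParseDemo and the helper canParse are hypothetical names introduced only for illustration, and Locale.CHINA stands in for "a non-English default locale".

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class LocaleParseDemo {
    // Hypothetical helper: true if `text` parses with `pattern` under `locale`.
    static boolean canParse(String text, String pattern, Locale locale) {
        // A fresh instance per call, because SimpleDateFormat is not thread-safe.
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, locale);
        try {
            fmt.parse(text);
            return true;
        } catch (ParseException e) {
            // The month name is not known to this locale's date symbols.
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "16/Mar/2017:12:25:01";
        String pattern = "dd/MMM/yyy:HH:mm:ss";
        System.out.println("ENGLISH: " + canParse(text, pattern, Locale.ENGLISH));
        System.out.println("CHINA:   " + canParse(text, pattern, Locale.CHINA));
        // new SimpleDateFormat(pattern) without a Locale uses this locale.
        Locale jvmDefault = Locale.getDefault(Locale.Category.FORMAT);
        System.out.println(jvmDefault + ": " + canParse(text, pattern, jvmDefault));
    }
}
```

The last line depends on the machine: it should succeed where the JVM default locale knows the English abbreviation "Mar" and fail where it does not, which is the behavior reported here.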
      

       

      I also found that in Spark, to_unix_timestamp and unix_timestamp use SimpleDateFormat as well, with Locale.US as the default. That fixes the English case but makes it impossible to parse month names written in the local language. For example, in a Chinese-locale environment, Hive parses the following correctly:

       

      hive> select to_unix_timestamp('16/三月/2017:12:25:01', 'dd/MMMM/yyy:HH:mm:ss');
      OK
      1489638301
      Time taken: 0.147 seconds, Fetched: 1 row(s)
      OK
      

      Spark, however, returns NULL for the same query.

      Because English-format dates are the most common, I think two SimpleDateFormat instances are needed: the existing one, which uses the JVM default locale, and a new one initialized with Locale.ENGLISH.
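
One way the two-format idea could look is sketched below. This is an illustration under stated assumptions, not the patch for GenericUDFToUnixTimeStamp: the class DualLocaleParser is a hypothetical name, the order "default locale first, then Locale.ENGLISH" is an assumption the description does not specify, and the locale and time zone are passed in explicitly so the behavior is reproducible.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class DualLocaleParser {
    private final SimpleDateFormat[] formats;

    // `defaultLocale` stands in for the JVM default locale (hypothetical parameter).
    DualLocaleParser(String pattern, Locale defaultLocale, TimeZone zone) {
        formats = new SimpleDateFormat[] {
            new SimpleDateFormat(pattern, defaultLocale),  // existing behavior
            new SimpleDateFormat(pattern, Locale.ENGLISH)  // proposed fallback
        };
        for (SimpleDateFormat fmt : formats) {
            fmt.setTimeZone(zone);
        }
    }

    // Returns epoch seconds, or null when neither format matches,
    // mirroring the NULL that the queries in this issue return.
    // Not thread-safe, because SimpleDateFormat is not.
    Long toUnixTimestamp(String text) {
        for (SimpleDateFormat fmt : formats) {
            try {
                return fmt.parse(text).getTime() / 1000;
            } catch (ParseException e) {
                // Fall through and try the next locale.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TimeZone zone = TimeZone.getTimeZone("Asia/Shanghai");
        DualLocaleParser parser =
            new DualLocaleParser("dd/MMM/yyy:HH:mm:ss", Locale.CHINA, zone);
        System.out.println(parser.toUnixTimestamp("16/Mar/2017:12:25:01"));
    }
}
```

Trying the default locale first keeps local-language input working as it does today, while the English fallback covers input such as 'Mar' on non-English machines. The cost is a second parse attempt on every string the default locale cannot handle.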

       


          People

            shezhiming shezm
            shezhiming shezm


              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 20m