Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-22500

sort by and timestamp casting filters out too many rows

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: Hive
    • Labels:
      None

      Description

      Consider this example.

      Preparation: 

      create temporary table opens as (
        select stack(1, 
               1 , cast('2019-11-13 08:07:28' as timestamp)
         ) as (id , load_ts )
      );
      

       

      Queries: This is just about counting the number of rows, with filters always matching, and possibly sort by. 1 is always expected.

      select count(*) from ( select * from opens) t;
      select count(*) from ( select * from opens sort by id) t;
      
      select count(*) from ( select * from opens where load_ts >= '2019-11-13 08:07:00' ) t;
      select count(*) from ( select * from opens where load_ts >= '2019-11-13 08:07:00' sort by id) t;
      
      select count(*) from ( select * from opens where load_ts <= '2019-11-13 09:07:00') t;
      select count(*) from ( select * from opens where load_ts <= '2019-11-13 09:07:00' sort by id) t;
      

       The last query (sort by and <= on timestamp) returns 0 rows.

      I believe that this is the cause of other issues I have, where I have missing rows in queries with the timestamp (but not the explicit sort by).

      Note that if instead of a temporary table I use a CTE for opens, the issue does not appear.

      I tried workarounds (inverse order of operands, adding not or not not ) to no avail.

      One thing that did work is to explicitly cast the string to a timestamp:

      select count(*) from ( select * from opens where load_ts <= cast('2019-11-13 09:07:00' as timestamp) sort by id) t;

      It might be good practice indeed, but there still is a discrepancy between how >= and <= are handled, or how sort by works.

      Note: this is on Hive from [hdp3.1.4|https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/release-notes/content/patch_hive.html] , without llap.

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              guillaume@lomig.net Guillaume
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated: