  1. Hive
  2. HIVE-21509

LLAP may cache corrupted column vectors and return wrong query result

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: llap
    • Labels: None

    Description

      In some scenarios, LLAP may cache column vectors that are reused and reset just before their original content is written, so the cached data ends up corrupted.

      This is a concurrency issue and is therefore flaky. It is not easy to reproduce, but the odds of hitting it can be improved by setting the LLAP executor and IO thread counts, and preparing the input, as follows:

      • set hive.llap.daemon.num.executors=32;
      • set hive.llap.io.threadpool.size=1;
      • use the TPC-DS store_sales table as input, with at least a few hundred thousand rows, stored in text format:
        ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
        WITH SERDEPROPERTIES ('field.delim'='|', 'serialization.format'='|')
        STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
        OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
      • more splits make the issue more likely to show itself, so it is worth setting: set tez.grouping.min-size=1024; set tez.grouping.max-size=1024;
      • run this query on the table: select min(ss_sold_date_sk) from store_sales;

      The first query result is correct (2450816 in my case). Repeating the query will trigger reading from LLAP cache and produce a wrong result: 0.

      To hit the issue reliably, place a Thread.sleep(250) at the beginning of VectorDeserializeOrcWriter#run().
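
      The general hazard can be illustrated outside of Hive with a minimal, self-contained sketch. It is not Hive code and does not reflect the attached patches: the class ReusedBufferRace, the method minSeenByWriter and the long[] standing in for a column vector are all hypothetical. A latch replaces the Thread.sleep(250) delay so that the ordering (producer resets the buffer, then the delayed writer reads it) is deterministic. Copying the buffer before handing it off is shown only as one possible way to avoid the problem.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names exist in Hive.
public class ReusedBufferRace {

    /** Returns the minimum value a delayed "cache writer" thread sees in the batch. */
    static long minSeenByWriter(boolean copyBeforeHandOff) throws Exception {
        // Stand-in for a reusable column vector filled with ss_sold_date_sk values.
        long[] batch = {2450816L, 2450817L, 2450818L};
        // Either hand off the very same array, or a private copy of it.
        long[] handedOff = copyBeforeHandOff ? Arrays.copyOf(batch, batch.length) : batch;

        CountDownLatch producerMovedOn = new CountDownLatch(1);
        ExecutorService writer = Executors.newSingleThreadExecutor();
        try {
            Future<Long> result = writer.submit(() -> {
                // Plays the role of the artificial delay: the writer only
                // starts reading after the producer has reset its buffer.
                producerMovedOn.await();
                return Arrays.stream(handedOff).min().getAsLong();
            });
            Arrays.fill(batch, 0L);      // producer resets the buffer for reuse
            producerMovedOn.countDown(); // now let the writer read
            return result.get();
        } finally {
            writer.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared buffer: " + minSeenByWriter(false)); // prints 0
        System.out.println("copied buffer: " + minSeenByWriter(true));  // prints 2450816
    }
}
```

      With the shared buffer the writer observes the reset values and reports 0, which matches the wrong result seen on the second query run; with a private copy it reports the original minimum.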

       

      Attachments

        1. HIVE-21509.0.wip.patch
          4 kB
          Ádám Szita
        2. HIVE-21509.1.wip.patch
          4 kB
          Ádám Szita
        3. HIVE-21509.2.patch
          11 kB
          Ádám Szita
        4. HIVE-21509.3.patch
          11 kB
          Ádám Szita
        5. HIVE-21509.4.patch
          11 kB
          Ádám Szita

          People

            szita Ádám Szita
            szita Ádám Szita