Hive / HIVE-26801

Query-based compaction fails on tables whose column names are keywords (e.g. row)

Details

    Description

      Query-based compaction fails on tables that have a column whose name is a keyword (row in this case). The failure occurs while running the generated INSERT INTO statement, because the column names in it are not quoted and the parser cannot recognize row as a column reference.

      Steps to reproduce the issue:

      CREATE TABLE aggregated_data(`sessionid` string,`row` int,`timeofoccurrence` bigint);
      insert into table aggregated_data values ("abcd",300,21111111111);
      insert into table aggregated_data values ("abcd",300,21111111111);
      alter table aggregated_data compact 'MAJOR' and wait;

      Error:

      2022-11-30 13:04:33,309 INFO  org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor: [repro894918]: Running major compaction via query: INSERT into table default_tmp_compactor_aggregated_data_1669813472898 select validate_acid_sort_order(ROW__ID.writeId, ROW__ID.bucketId, ROW__ID.rowId), ROW__ID.writeId, ROW__ID.bucketId, ROW__ID.rowId, ROW__ID.writeId, NAMED_STRUCT('sessionid', sessionid, 'row', row, 'timeofoccurrence', timeofoccurrence)  from default.aggregated_data
      2022-11-30 13:04:33,309 INFO  org.apache.hadoop.hive.ql.Driver: [repro894918]: Compiling command(queryId=hive_20221130130433_de2a8b2d-f993-44e5-8aeb-decba3342a85): INSERT into table default_tmp_compactor_aggregated_data_1669813472898 select validate_acid_sort_order(ROW__ID.writeId, ROW__ID.bucketId, ROW__ID.rowId), ROW__ID.writeId, ROW__ID.bucketId, ROW__ID.rowId, ROW__ID.writeId, NAMED_STRUCT('sessionid', sessionid, 'row', row, 'timeofoccurrence', timeofoccurrence)  from default.aggregated_data
      2022-11-30 13:04:33,314 ERROR org.apache.hadoop.hive.ql.Driver: [repro894918]: FAILED: ParseException line 1:277 cannot recognize input near 'row' ',' ''timeofoccurrence'' in select expression
      org.apache.hadoop.hive.ql.parse.ParseException: line 1:277 cannot recognize input near 'row' ',' ''timeofoccurrence'' in select expression
              at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:128)
              at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:82)
              at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
              at org.apache.hadoop.hive.ql.Compiler.parse(Compiler.java:173)
              at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:102)
              at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:196)
              at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:615)
              at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:673)
              at org.apache.hadoop.hive.ql.Driver.run(Driver.java:505)
              at org.apache.hadoop.hive.ql.Driver.run(Driver.java:494)
              at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:70)
              at org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor.runCompactionQueries(QueryCompactor.java:133)
              at org.apache.hadoop.hive.ql.txn.compactor.MajorQueryCompactor.runCompaction(MajorQueryCompactor.java:63)
              at org.apache.hadoop.hive.ql.txn.compactor.Worker.findNextCompactionAndExecute(Worker.java:562)
              at org.apache.hadoop.hive.ql.txn.compactor.Worker.lambda$run$0(Worker.java:113)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:750)
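
The ParseException above is raised because row appears unquoted as the value expression inside NAMED_STRUCT. A minimal sketch of how a query builder could backtick-quote the column references is shown below. The class and method names (CompactionQuerySketch, quoteIdentifier, buildNamedStruct) are hypothetical and are not taken from the Hive code base; the sketch assumes that backtick-quoted identifiers are enabled and that an embedded backtick is escaped by doubling it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CompactionQuerySketch {

    // Hypothetical helper: wrap an identifier in backticks so that keywords
    // such as row parse as column names. Assumes an embedded backtick is
    // escaped by doubling it.
    public static String quoteIdentifier(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    // Hypothetical helper: build the NAMED_STRUCT(...) expression with a
    // string-literal field name followed by the quoted column reference.
    // Simplification: field names containing a single quote are not escaped.
    public static String buildNamedStruct(List<String> columns) {
        return columns.stream()
                .map(c -> "'" + c + "', " + quoteIdentifier(c))
                .collect(Collectors.joining(", ", "NAMED_STRUCT(", ")"));
    }

    public static void main(String[] args) {
        // Columns of the aggregated_data table from the reproduction steps.
        List<String> columns = Arrays.asList("sessionid", "row", "timeofoccurrence");
        System.out.println(buildNamedStruct(columns));
    }
}
```

With the columns from the reproduction steps, every column reference in the generated expression is wrapped in backticks, so row is no longer parsed as a keyword. Only the value expressions need quoting; the field names are already string literals.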

            People

              Gangadharan Gopinath
              Gangadharan Gopinath
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 40m