HIVE-25316: Map partition key columns when pushing TNK op through select


Details

    Description

      The following TPC-DS query fails at runtime when the table store_sales is an external JDBC table.

      SELECT ranking
      FROM
          (SELECT rank() OVER (PARTITION BY ss_store_sk
              ORDER BY sum(ss_net_profit)) AS ranking
           FROM store_sales
           GROUP BY ss_store_sk) tmp1
      WHERE ranking <= 5
      

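      The stack trace below shows that the problem occurs while initializing the TopNKeyOperator: its key expression refers to the column _col0, which is not present in the operator's input row schema [0:ss_store_sk, 1:$f1].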

      2021-07-08T09:04:37,444 ERROR [TezTR-270335_1_3_0_0_0] tez.TezProcessor: Failed initializeAndRunProcessor
      java.lang.RuntimeException: Map operator initialization failed
              at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:310) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:277) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) [tez-runtime-internals-0.10.0.jar:0.10.0]
              at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) [tez-runtime-internals-0.10.0.jar:0.10.0]
              at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) [tez-runtime-internals-0.10.0.jar:0.10.0]
              at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_261]
              at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_261]
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) [hadoop-common-3.1.0.jar:?]
              at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) [tez-runtime-internals-0.10.0.jar:0.10.0]
              at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) [tez-runtime-internals-0.10.0.jar:0.10.0]
              at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) [tez-common-0.10.0.jar:0.10.0]
              at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) [hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_261]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
              at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]
      Caused by: java.lang.RuntimeException: cannot find field _col0 from [0:ss_store_sk, 1:$f1]
              at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:550) ~[hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:153) ~[hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:56) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.TopNKeyOperator.initObjectInspectors(TopNKeyOperator.java:101) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.TopNKeyOperator.initializeOp(TopNKeyOperator.java:82) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:506) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:314) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
              ... 16 more
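
      The error message, read together with the issue title, suggests that a TopNKey (TNK) operator pushed below a select kept a key expressed in the select's output names (_col0) instead of its input names (ss_store_sk). The sketch below illustrates that kind of column mapping in isolation. It is an assumption-laden illustration, not Hive's implementation: the class TopNKeyColumnMapping, the method mapThroughSelect, and the output-to-input map are all hypothetical, and the pairing of _col1 with $f1 is assumed from the schema in the error message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: these names are hypothetical and are not Hive classes.
class TopNKeyColumnMapping {

    // Rewrites key columns named after a select's outputs (e.g. _col0)
    // into the names of the select's input columns (e.g. ss_store_sk).
    static List<String> mapThroughSelect(List<String> keyColumns,
                                         Map<String, String> selectOutputToInput) {
        List<String> mapped = new ArrayList<>();
        for (String key : keyColumns) {
            String input = selectOutputToInput.get(key);
            if (input == null) {
                // No known input column behind this output: refuse to map it.
                throw new IllegalArgumentException("cannot map key column " + key);
            }
            mapped.add(input);
        }
        return mapped;
    }

    public static void main(String[] args) {
        // Assumed select: _col0 <- ss_store_sk, _col1 <- $f1
        Map<String, String> select = new LinkedHashMap<>();
        select.put("_col0", "ss_store_sk");
        select.put("_col1", "$f1");

        List<String> partitionKeys = Arrays.asList("_col0");
        List<String> orderKeys = Arrays.asList("_col1");

        // Both key lists must be rewritten, not only the ordering keys.
        System.out.println(mapThroughSelect(partitionKeys, select)); // [ss_store_sk]
        System.out.println(mapThroughSelect(orderKeys, select));     // [$f1]
    }
}
```

      In this toy model, skipping the mapping for the partition keys would leave _col0 in place below the select, which corresponds to the "cannot find field _col0" failure reported above.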
      

      Attachments

        1. external_jdbc_table_perf2.q
          43 kB
          Stamatis Zampetakis

          People

            kkasa Krisztian Kasa
            zabetak Stamatis Zampetakis

              Time Tracking

              Original Estimate: Not Specified
              Remaining Estimate: 0h
              Time Spent: 0.5h
