Hive / HIVE-27059

Wrong object inspector is created when collect_list is used with map-side aggregation disabled


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.3.8, 3.1.3, 4.0.0-alpha-2
    • Fix Version/s: None
    • Component/s: Query Planning

    Description

      The query fails when collect_list (or collect_set) is applied to a map or array argument while map-side aggregation is disabled:

      Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryMap cannot be cast to java.util.Map
              at org.apache.hadoop.hive.serde2.objectinspector.StandardMapObjectInspector.getMap(StandardMapObjectInspector.java:85)
              at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:437)
              at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:362)
              at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.putIntoCollection(GenericUDAFMkCollectionEvaluator.java:154)
              at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.iterate(GenericUDAFMkCollectionEvaluator.java:120)
              at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:192)
              at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:638)
              at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:877)
              at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:721)
              at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
      

      To reproduce this issue:

      create table tb1 (a int, b string, c string);
      insert into tb1 values (1, "100", "101");
      insert into tb1 values (1, "102", "103");
      insert into tb1 values (2, "200", "201");
      set hive.map.aggr=false;
      select a, collect_list(map("b",b,"c",c)) as col1 from tb1 group by a;
      select a, collect_set(array(b, c)) as col1 from tb1 group by a;
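
One way to confirm which aggregation path the failing query takes is to look at its plan. The snippet below is a minimal sketch that reuses the table and query from the reproduction. The comment about the plan shape is an assumption about typical Hive output, not something verified for each affected version.

```sql
-- Sketch: inspect the plan of the failing query with map-side aggregation off.
set hive.map.aggr=false;

explain
select a, collect_list(map("b",b,"c",c)) as col1 from tb1 group by a;

-- Assumption: with hive.map.aggr=false the plan should contain only a
-- reduce-side Group By Operator (no map-side partial aggregation), which
-- would match the stack trace above: GroupByOperator.processAggr calls
-- iterate() on a row that is still a LazyBinaryMap.
```

Running the same explain after `set hive.map.aggr=true;` gives a plan to compare against.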
      

      To work around this issue:

      set hive.map.aggr=true;
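
If other queries in the session need to keep map-side aggregation disabled, the workaround can be limited to the affected statements. This is a sketch only, reusing the queries from the reproduction. Restoring the value to false afterwards assumes that false was the setting in use before.

```sql
-- Enable map-side aggregation only around the affected queries.
set hive.map.aggr=true;

select a, collect_list(map("b",b,"c",c)) as col1 from tb1 group by a;
select a, collect_set(array(b, c)) as col1 from tb1 group by a;

-- Assumption: the session previously ran with hive.map.aggr=false.
set hive.map.aggr=false;
```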
      

            People

              Assignee: Genmao Yu (uncleGen)
              Reporter: Genmao Yu (uncleGen)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h