Hive / HIVE-14567

After enabling Hive Parquet Vectorization, POWER_TEST of query24 in TPCx-BB (BigBench) fails at the 1TB scale factor but succeeds at the 3TB scale factor

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Later
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: File Formats, Hive
    • Environment:
      Apache Hadoop 2.6.0
      Apache Hive 2.2.0
      JDK 1.8.0_73
      TPCx-BB 1.0.1

    Description

      We used TPCx-BB (BigBench) to evaluate the performance of Hive Parquet vectorization on our local cluster (E5-2699 v3, 256G, 72 vcores, 1 master node + 5 worker nodes). During the performance test, query24 in TPCx-BB failed at the 1TB scale factor but succeeded at the 3TB scale factor under the same conditions. We retried at the 100GB, 10GB, and 1GB scale factors, and all of them failed. In other words, the query fails at the smaller data scales but succeeds at the larger one, which seems very unusual.

      The failure log is listed below:
      Diagnostic Messages for this Task:
      Error: java.io.IOException: java.io.IOException: java.lang.IllegalArgumentException: 8 > 4
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
      at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:230)
      at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
      at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
      at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
      at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
      at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
      Caused by: java.io.IOException: java.lang.IllegalArgumentException: 8 > 4
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
      at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:357)
      at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:106)
      at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:42)
      at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:118)
      at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:228)
      ... 11 more
      Caused by: java.lang.IllegalArgumentException: 8 > 4
      at java.util.Arrays.copyOfRange(Arrays.java:3519)
      at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat$VectorizedParquetRecordReader.assignVector(VectorizedParquetInputFormat.java:315)
      at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat$VectorizedParquetRecordReader.next(VectorizedParquetInputFormat.java:237)
      at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat$VectorizedParquetRecordReader.next(VectorizedParquetInputFormat.java:97)
      at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:352)
      ... 15 more

      FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
      MapReduce Jobs Launched:
      Stage-Stage-2: Map: 3 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
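
      The innermost message in the log above, "8 > 4", has the form that java.util.Arrays.copyOfRange produces when its "from" index is greater than its "to" index. The following is a minimal sketch of that JDK behavior only, not Hive's code: the class name, the helper method, and the 16-byte buffer are hypothetical, and the report does not say which copyOfRange overload or which buffer assignVector actually uses.

```java
import java.util.Arrays;

// Hypothetical demo class; it is not part of Hive.
class CopyOfRangeDemo {

    // Calls Arrays.copyOfRange on a dummy 16-byte buffer and returns the
    // exception message, or null if the call succeeds.
    static String copyMessage(int from, int to) {
        byte[] buffer = new byte[16]; // placeholder data, size chosen arbitrarily
        try {
            Arrays.copyOfRange(buffer, from, to);
            return null;
        } catch (IllegalArgumentException e) {
            // Thrown when from > to; the message is "<from> > <to>".
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(copyMessage(8, 4)); // prints "8 > 4"
        System.out.println(copyMessage(4, 8)); // prints "null" (valid range)
    }
}
```

      If the same mechanism is at work here, the reader computed a start offset of 8 and an end offset of 4 for some value. That would be consistent with a width mismatch (for example, 8-byte versus 4-byte values), but the log alone does not confirm it.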

          People

            Assignee: Unassigned
            Reporter: KaiXu