HIVE-17943 (sub-task of HIVE-17458: VectorizedOrcAcidRowBatchReader doesn't handle 'original' files)

select ROW__ID, t, si, i from over10k_orc_bucketed where b = 4294967363 and t < 100 order by ROW__ID fails on LLAP


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: 4.0.0-alpha-2
    • Component/s: Transactions
    • Labels: None

    Description

      See acid_vectorization_original.q from HIVE-17458 and run:
      mvn test -Dtest=TestMiniLlapCliDriver -Dqfile=acid_vectorization_original.q
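
      For reference, the statement from the summary that fails under LLAP is sketched below as a manual repro against the over10k_orc_bucketed table set up by acid_vectorization_original.q. The SET statements are assumptions added here to make sure the query goes through vectorization and the LLAP IO path; they are not lines copied from the q file.

      -- assumed session settings for a manual repro (not from acid_vectorization_original.q)
      SET hive.vectorized.execution.enabled=true;
      SET hive.llap.io.enabled=true;
      SET hive.fetch.task.conversion=none;  -- force a Tez/LLAP task instead of a local fetch

      -- failing query from the issue summary
      select ROW__ID, t, si, i
      from over10k_orc_bucketed
      where b = 4294967363 and t < 100
      order by ROW__ID;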

      The test run generates:

      2017-10-30T14:46:42,516  INFO [TezTR-936982_1_38_0_0_0] lib.MRReaderMapred: Processing split: TezGroupedSplit{wrappedSplits=[org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [hdfs://localhost:65102/build/ql/test/data/warehouse/over10k_orc_bucketed/000003_0, start=3, length=7242, isOriginal=true, fileLength=7855, hasFooter=false, hasBase=true, deltas=0], org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [hdfs://localhost:65102/build/ql/test/data/warehouse/over10k_orc_bucketed/000003_0_copy_1, start=3, length=7242, isOriginal=true, fileLength=7855, hasFooter=false, hasBase=true, deltas=0], org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [hdfs://localhost:65102/build/ql/test/data/warehouse/over10k_orc_bucketed/000000_0, start=3, length=7003, isOriginal=true, fileLength=7658, hasFooter=false, hasBase=true, deltas=0], org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [hdfs://localhost:65102/build/ql/test/data/warehouse/over10k_orc_bucketed/000000_0_copy_1, start=3, length=7003, isOriginal=true, fileLength=7658, hasFooter=false, hasBase=true, deltas=0], org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [hdfs://localhost:65102/build/ql/test/data/warehouse/over10k_orc_bucketed/000001_0, start=3, length=6961, isOriginal=true, fileLength=7611, hasFooter=false, hasBase=true, deltas=0], org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [hdfs://localhost:65102/build/ql/test/data/warehouse/over10k_orc_bucketed/000001_0_copy_1, start=3, length=6961, isOriginal=true, fileLength=7611, hasFooter=false, hasBase=true, deltas=0], org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [hdfs://localhost:65102/build/ql/test/data/warehouse/over10k_orc_bucketed/000002_0, start=3, length=6637, isOriginal=true, fileLength=7262, hasFooter=false, hasBase=true, deltas=0], org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [hdfs://localhost:65102/build/ql/test/data/warehouse/over10k_orc_bucketed/000002_0_copy_1, start=3, length=6637, isOriginal=true, fileLength=7262, hasFooter=false, hasBase=true, deltas=0]], wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat', locations=[127.0.0.1], rack='null', length=55686}
      2017-10-30T14:46:42,516  INFO [IO-Elevator-Thread-3] LlapIoImpl: Processing data for file 16896: hdfs://localhost:65102/build/ql/test/data/warehouse/over10k_orc_bucketed/000003_0
      2017-10-30T14:46:42,516  INFO [TezTR-936982_1_38_0_0_0] input.MRInput: over10k_orc_bucketed initialized RecordReader from event
      2017-10-30T14:46:42,516 DEBUG [IO-Elevator-Thread-3] LlapIoOrc: FileSplit {3, 7242}; stripes {3, 7242},
      2017-10-30T14:46:42,517 DEBUG [TezTR-936982_1_38_0_0_0] tez.MapRecordProcessor: Starting Output: Reducer 2
      2017-10-30T14:46:42,518  INFO [TezTR-936982_1_38_0_0_0] impl.ExternalSorter: Reducer 2 using: memoryMb=24, keySerializerClass=class org.apache.hadoop.hive.ql.io.HiveKey, valueSerializerClass=org.apache.tez.runtime.library.common.serializer.TezBytesWritableSerialization$TezBytesWritableSerializer@54f817f7, comparator=org.apache.tez.runtime.library.common.comparator.TezBytesComparator@367d42fb, partitioner=org.apache.tez.mapreduce.partition.MRPartitioner, serialization=org.apache.tez.runtime.library.common.serializer.TezBytesWritableSerialization,org.apache.tez.runtime.library.common.serializer.TezBytesWritableSerialization,org.apache.hadoop.io.serializer.WritableSerialization, org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization, org.apache.hadoop.io.serializer.avro.AvroReflectSerialization, reportPartitionStats=MEMORY_OPTIMIZED
      2017-10-30T14:46:42,518  INFO [TezTR-936982_1_38_0_0_0] impl.PipelinedSorter: Setting up PipelinedSorter for Reducer 2: , UsingHashComparator=true
      2017-10-30T14:46:42,520 DEBUG [IO-Elevator-Thread-3] LlapIoImpl: setError called; current state closed false, done false, err null, pending 0
      2017-10-30T14:46:42,520  WARN [IO-Elevator-Thread-3] LlapIoImpl: setError called with an error
      java.lang.ArrayIndexOutOfBoundsException: 17
              at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readIndexStreams(EncodedReaderImpl.java:1797) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.readStripesMetadata(OrcEncodedDataReader.java:681) [hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:327) [hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:272) [hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:269) [hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
              at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_25]
              at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_25]
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807) [hadoop-common-2.8.1.jar:?]
              at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:269) [hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
              at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109) [hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
              at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) [tez-common-0.9.0.jar:0.9.0]
              at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110) [hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
              at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_25]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_25]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_25]
              at java.lang.Thread.run(Thread.java:745) [?:1.8.0_25]
      2017-10-30T14:46:42,521  INFO [IO-Elevator-Thread-3] encoded.OrcEncodedDataReader: Dumping LLAP IO trace; 0 bytes
      2017-10-30T14:46:42,521 DEBUG [IO-Elevator-Thread-3] impl.StatsRecordingThreadPool: Updated stats: instance: OrcEncodedDataReader thread name: IO-Elevator-Thread-3 thread id: 3110 scheme: file bytesRead: 0 bytesWritten: 0 readOps: 0 largeReadOps: 0 writeOps: 0
      2017-10-30T14:46:42,521 DEBUG [IO-Elevator-Thread-3] impl.StatsRecordingThreadPool: Updated stats: instance: OrcEncodedDataReader thread name: IO-Elevator-Thread-3 thread id: 3110 scheme: hdfs bytesRead: 0 bytesWritten: 0 readOps: 0 largeReadOps: 0 writeOps: 0
      
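      Since the failure comes out of the LLAP IO elevator path (OrcEncodedDataReader -> EncodedReaderImpl.readIndexStreams) rather than the operator pipeline, one hypothetical way to narrow it down, not recorded in this ticket, is to re-run the same query with LLAP IO disabled for the session and check whether the ArrayIndexOutOfBoundsException goes away:

      -- hypothetical diagnostic step (assumption, not from the ticket): bypass the LLAP IO elevator
      SET hive.llap.io.enabled=false;
      select ROW__ID, t, si, i from over10k_orc_bucketed where b = 4294967363 and t < 100 order by ROW__ID;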



          People

            Assignee: Eugene Koifman (ekoifman)
            Reporter: Eugene Koifman (ekoifman)
            Votes: 0
            Watchers: 4
