Apache Drill / DRILL-3947

IndexOutOfBoundsException for pruning on date column (at large scale)


Details

    Description

      When a large table (about 52 B records, 10K files, created with CTAS auto-partitioning) is partitioned by a 'date' column, partition pruning fails with an error. At smaller scales, partition pruning succeeds. So far, the problem appears to be specific to date columns. The column is nullable, and the data contains NULL values.

      Here's the query:

      explain plan for select count(*) from `table` where `date` = '2015-07-01';
      

      Here's the error stack:

      WARN  o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune partition.
      java.lang.IndexOutOfBoundsException: index: 4096, length: 1 (expected: range(0, 4096))
              at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:189) ~[drill-java-exec-1.2.0.jar:4.0.27.Final]
              at io.netty.buffer.DrillBuf.chk(DrillBuf.java:211) ~[drill-java-exec-1.2.0.jar:4.0.27.Final]
              at io.netty.buffer.DrillBuf.setByte(DrillBuf.java:612) ~[drill-java-exec-1.2.0.jar:4.0.27.Final]
              at org.apache.drill.exec.vector.UInt1Vector$Mutator.set(UInt1Vector.java:411) ~[drill-java-exec-1.2.0.jar:1.2.0]
              at org.apache.drill.exec.vector.NullableDateVector$Mutator.set(NullableDateVector.java:440) ~[drill-java-exec-1.2.0.jar:1.2.0]
              at org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:420) ~[drill-java-exec-1.2.0.jar:1.2.0]
              at org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96) ~[drill-java-exec-1.2.0.jar:1.2.0]
              at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:212) ~[drill-java-exec-1.2.0.jar:1.2.0]
              at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) [drill-java-exec-1.2.0.jar:1.2.0]
              at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
              at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
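
      The trace shows a one-byte write at index 4096 into a buffer whose valid range is 0 to 4095. The sketch below illustrates that failure mode in plain Java, without any Drill classes. It assumes (not confirmed by this report) that one entry is written per file and that the buffer starts with room for 4096 entries. ByteVector and its set/setSafe methods are hypothetical stand-ins for a fixed-capacity write and a write that grows the buffer on demand.

```java
import java.util.Arrays;

public class PruningVectorSketch {

    /** Hypothetical stand-in for a vector holding one byte per entry. */
    static final class ByteVector {
        private byte[] buf;

        ByteVector(int capacity) {
            buf = new byte[capacity];
        }

        int capacity() {
            return buf.length;
        }

        /** Fixed-capacity write: fails once the index reaches the capacity. */
        void set(int index, byte value) {
            if (index < 0 || index >= buf.length) {
                throw new IndexOutOfBoundsException(
                    "index: " + index + ", length: 1 (expected: range(0, " + buf.length + "))");
            }
            buf[index] = value;
        }

        /** Growing write: doubles the buffer until the index fits, then writes. */
        void setSafe(int index, byte value) {
            int newCapacity = Math.max(buf.length, 1);
            while (index >= newCapacity) {
                newCapacity *= 2;
            }
            if (newCapacity != buf.length) {
                buf = Arrays.copyOf(buf, newCapacity);
            }
            buf[index] = value;
        }
    }

    /** Writes one entry per file with set(); returns the first failing index, or -1. */
    static int firstFailingIndex(int initialCapacity, int fileCount) {
        ByteVector v = new ByteVector(initialCapacity);
        for (int i = 0; i < fileCount; i++) {
            try {
                v.set(i, (byte) 1);
            } catch (IndexOutOfBoundsException e) {
                return i;
            }
        }
        return -1;
    }

    /** Writes one entry per file with setSafe(); returns the final capacity. */
    static int capacityAfterSafeWrites(int initialCapacity, int fileCount) {
        ByteVector v = new ByteVector(initialCapacity);
        for (int i = 0; i < fileCount; i++) {
            v.setSafe(i, (byte) 1);
        }
        return v.capacity();
    }

    public static void main(String[] args) {
        System.out.println(firstFailingIndex(4096, 10_000));       // large scale
        System.out.println(firstFailingIndex(4096, 1_000));        // small scale
        System.out.println(capacityAfterSafeWrites(4096, 10_000)); // growing write
    }
}
```

      Under these assumptions, the fixed-capacity write succeeds for 1,000 entries and fails at index 4096 for 10,000 entries, which would be consistent with pruning working at smaller scales. Whether this is the actual root cause in ParquetGroupScan.populatePruningVector is not established here.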
      


          People

            amansinha100 Aman Sinha