Details
-
Improvement
-
Status: Closed
-
Major
-
Resolution: Duplicate
-
Patch, Important
Description
Currently, ORC supports filtering at the file, stripe, and row-group levels.
There is an ongoing effort in ORC-577 to add finer-grained row-level filters using filter predicates passed through Reader.Options.
However, there are still cases where a framework built on TreeReader wants to skip particular rows without using predicates (e.g., simply by supplying the indexes of the rows to be skipped), in order to avoid decoding expensive types such as DecimalColumnVector or Decimal64ColumnVector.
In this ticket I propose extending the TreeReader abstract class with an additional nextVector method:
abstract void nextVector(ColumnVector previous, boolean[] isNull, boolean[] skipRows, final int batchSize)
Subclasses implementing this method can then call the existing skipRows method to step over the rows flagged in the skipRows array argument, avoiding expensive decoding for those rows.
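To illustrate the idea, here is a minimal, self-contained sketch. It does not use the real ORC classes: ToyDecimalReader, its String-encoded input, and the decodeCount counter are hypothetical stand-ins, and the isNull argument from the proposed signature is left out for brevity. It only shows how a reader might collapse each run of flagged rows into a single skipRows call and decode the remaining rows.

```java
import java.math.BigDecimal;

public class SkipRowsSketch {

    /** Hypothetical stand-in for a column reader; NOT the real ORC TreeReader. */
    static final class ToyDecimalReader {
        private final String[] encoded; // pretend each entry is costly to decode
        int position = 0;               // index of the next encoded value
        int decodeCount = 0;            // how many values were actually decoded

        ToyDecimalReader(String[] encoded) {
            this.encoded = encoded;
        }

        /** Advance past `rows` values without decoding them. */
        void skipRows(long rows) {
            position += (int) rows;
        }

        /** Decode the next value; this is the "expensive" step in the sketch. */
        private BigDecimal decodeNext() {
            decodeCount++;
            return new BigDecimal(encoded[position++]);
        }

        /**
         * Fill `out` for one batch. Rows flagged in `skipRows` are stepped over
         * (their slots in `out` are left untouched); all other rows are decoded.
         */
        void nextVector(BigDecimal[] out, boolean[] skipRows, int batchSize) {
            int row = 0;
            while (row < batchSize) {
                if (skipRows[row]) {
                    // Collapse a run of consecutive skipped rows into one skip call.
                    int start = row;
                    while (row < batchSize && skipRows[row]) {
                        row++;
                    }
                    this.skipRows(row - start);
                } else {
                    out[row] = decodeNext();
                    row++;
                }
            }
        }
    }

    public static void main(String[] args) {
        ToyDecimalReader reader =
            new ToyDecimalReader(new String[] {"1.5", "2.5", "3.5", "4.5", "5.5"});
        BigDecimal[] out = new BigDecimal[5];
        boolean[] skip = {false, true, true, false, true};
        reader.nextVector(out, skip, 5);
        System.out.println("decoded " + reader.decodeCount + " of 5 rows");
    }
}
```

In this sketch the caller must treat the slots of skipped rows as undefined. A real implementation would also have to account for null rows and for any streams that share position state with the data stream.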
Attachments
Issue Links
- duplicates
-
ORC-577 Allow row-level filtering
- Closed
- links to