Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
3.1.0
-
None
Description
When joining a partitioned parquet table with another table by partition column it should prunne partitions from partitioned table based on another table values.
example:
tableA parquet table partitioned be part_filter
tableB table with column with partition values
tableA is partitioned by part_A,part_B,part_C,part_D
tableB is a single column with 2 rows having part_A and part_B as values.
doing
select * from tableA inner join tableB on tableA.part_filter=tableB.part_filter
should generate a partition prunning on tableA based on tableB values (in this case scanning only 2 partitions) but it wll read all 4 partitions from tableA only filter the results.
note: this kind of approach works on Hive (filtering tableA partitions)