Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
ghx-label-8
Description
During scheduling Impala does the following:
- Non-Iceberg tables
- The scheduler processes the scan ranges in partition key order
- The scheduler selects N replicas as candidates
- The scheduler chooses the executor from the candidates based on minimum number of assigned bytes
- So consecutive partitions are more likely to be assigned to different executors
- Iceberg tables
- The scheduler processes the scan ranges in random order
- The scheduler selects N replicas as candidates
- The scheduler chooses the executor from the candidates based on minimum number of assigned bytes
- So consecutive partitions (by partition key order) are assigned randomly, i.e. there's a higher chances of clustering
If the IcebergScanNode ordered its file descriptors based on their paths we would have a more balanced scheduling for consecutive partitions. Queries that operate on a range of partitions are quite common, so it makes sense to optimize that case.