Details
- Improvement
- Status: In Progress
- Major
- Resolution: Unresolved
- None
- None
Description
Join elimination is a useful optimizer improvement.
Consider a query that joins two tables but does not use any of the Dept columns in its output:
SELECT Emp.name, Emp.salary FROM Emp, Dept WHERE Emp.deptno = Dept.dno
Assume that Emp.deptno is a non-null foreign key referencing Dept.dno, and that Dept.dno is a unique key. Every Emp row then matches exactly one Dept row, so the SQL above can be rewritten as follows, removing the Dept table without affecting the result set:
SELECT Emp.name, Emp.salary FROM Emp
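To illustrate why both assumptions matter, here is a minimal in-memory sketch in plain Java. It is not Calcite code: the class `JoinEliminationDemo`, the `Emp` holder, and the methods `withJoin` and `withoutJoin` are hypothetical names introduced only for this example. It evaluates the two queries above over small lists, so the results can be compared when the constraints hold and when they are violated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it only mimics the two SQL queries in memory.
final class JoinEliminationDemo {

    // Hypothetical row type for the Emp table; deptno is nullable on purpose.
    static final class Emp {
        final String name;
        final int salary;
        final Integer deptno;

        Emp(String name, int salary, Integer deptno) {
            this.name = name;
            this.salary = salary;
            this.deptno = deptno;
        }
    }

    // SELECT Emp.name, Emp.salary FROM Emp, Dept WHERE Emp.deptno = Dept.dno
    // The Dept table is represented by the list of its dno values.
    static List<String> withJoin(List<Emp> emps, List<Integer> deptDnos) {
        List<String> rows = new ArrayList<>();
        for (Emp e : emps) {
            for (Integer dno : deptDnos) {
                // A NULL deptno never satisfies the equality predicate.
                if (e.deptno != null && e.deptno.equals(dno)) {
                    rows.add(e.name + ":" + e.salary);
                }
            }
        }
        return rows;
    }

    // SELECT Emp.name, Emp.salary FROM Emp
    static List<String> withoutJoin(List<Emp> emps) {
        List<String> rows = new ArrayList<>();
        for (Emp e : emps) {
            rows.add(e.name + ":" + e.salary);
        }
        return rows;
    }
}
```

When every `deptno` is non-null and matches exactly one `dno`, both methods return the same rows. A null `deptno` makes the join drop a row, and a duplicated `dno` makes the join repeat a row, so in either case the rewrite would change the result.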
Without join elimination, the query executes a redundant join and may perform poorly.
This optimization is also available in SQL Server, Oracle, Snowflake, and other systems.
I think it would be useful in Calcite as well. The infrastructure that join elimination depends on is already available.
The main steps are as follows:
1. Analyze the columns used by the project, and split them into those from the left side and those from the right side of the join.
2. Based on the project information above and the join type (for example, the outer join type), bail out in cases where no side can be removed.
3. Get the join info, such as the join keys.
4. For an inner join, check the foreign keys and unique keys. This may use RelMetadataQuery#getForeignKeys (newly added, similar to RelMetadataQuery#getUniqueKeys) and RelOptTable#getReferentialConstraints.
5. For both outer and inner joins, check that the join keys on the side being removed are unique (areColumnsUnique).
6. If all checks pass, calculate the final project and transform.
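The checks in the steps above can be sketched as a small decision function. This is a simplified illustration in plain Java, not the proposed Calcite rule: `JoinEliminationSketch`, `SideInfo`, and `removableSide` are hypothetical names, and the three boolean fields stand in for the answers that the project analysis and the metadata queries would provide. The sketch also assumes a pure equi-join with no additional join conditions.

```java
// Hypothetical sketch of the decision logic; it does not use Calcite classes.
final class JoinEliminationSketch {

    enum JoinType { INNER, LEFT, RIGHT, FULL }

    enum Side { LEFT, RIGHT, NONE }

    // Hypothetical summary of what is known about one join input.
    static final class SideInfo {
        // Step 1: does the project reference any column of this input?
        final boolean usedByProject;
        // Step 5: are this input's join keys unique?
        final boolean joinKeysUnique;
        // Step 4: are this input's join keys a non-null foreign key
        // that references the join keys of the other input?
        final boolean joinKeysAreNonNullForeignKey;

        SideInfo(boolean usedByProject, boolean joinKeysUnique,
                 boolean joinKeysAreNonNullForeignKey) {
            this.usedByProject = usedByProject;
            this.joinKeysUnique = joinKeysUnique;
            this.joinKeysAreNonNullForeignKey = joinKeysAreNonNullForeignKey;
        }
    }

    // Returns the side that can be dropped, or NONE if the join must stay.
    static Side removableSide(JoinType type, SideInfo left, SideInfo right) {
        if (canRemove(type, Side.RIGHT, right, left)) {
            return Side.RIGHT;
        }
        if (canRemove(type, Side.LEFT, left, right)) {
            return Side.LEFT;
        }
        return Side.NONE;
    }

    private static boolean canRemove(JoinType type, Side side,
                                     SideInfo removed, SideInfo kept) {
        // Step 2: bail out if the project needs columns from this side.
        if (removed.usedByProject) {
            return false;
        }
        // Step 5: a non-unique key could duplicate rows of the kept side.
        if (!removed.joinKeysUnique) {
            return false;
        }
        switch (type) {
            case LEFT:
                // A left outer join preserves every left row, so only the
                // right side is a candidate for removal.
                return side == Side.RIGHT;
            case RIGHT:
                return side == Side.LEFT;
            case INNER:
                // Step 4: each kept row must match exactly one removed row.
                return kept.joinKeysAreNonNullForeignKey;
            default:
                // FULL joins are not handled in this sketch.
                return false;
        }
    }
}
```

For the Emp/Dept query in the description, the left input (Emp) is used by the project and has a non-null foreign key, while the right input (Dept) is unused and has unique join keys, so `removableSide(JoinType.INNER, emp, dept)` returns `Side.RIGHT`. Step 6, building the final project over the remaining input, is left out of the sketch.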
Please help me check whether this improvement is useful.
I would like to add it to Calcite.
Issue Links
- depends upon
- CALCITE-5881 Support to get foreign keys metadata in RelMetadataQuery
- In Progress
- is related to
- CALCITE-1731 Rewriting of queries using materialized views with joins and aggregates
- Closed
- links to