Details
Improvement
Status: Resolved
Normal
Resolution: Fixed
None
Description
If a SELECT contains a custom index expression (CASSANDRA-10217), that expression should always be chosen as the primary expression during query execution. If the statement also contains other expressions which can be satisfied by a built-in index, we don't currently have the ability to apply the custom expression as a filter. What's more, the method of selecting which index to use is fairly primitive (and cannot be overridden until CASSANDRA-10214), so we should ensure that a custom expression, if present, is always chosen.
Suppose we have a custom index implementation which provides prefix matching on text fields.
CREATE TABLE ks.t (k int, v1 int, v2 text, PRIMARY KEY(k));
CREATE INDEX v1_idx ON ks.t(v1);
CREATE CUSTOM INDEX v2_idx ON ks.t(v2) USING 'com.example.CustomIndex';
INSERT INTO ks.t(k, v1, v2) VALUES(0, 0, 'abc');
INSERT INTO ks.t(k, v1, v2) VALUES(1, 1, 'def');
SELECT * FROM ks.t WHERE v1=0 AND expr(v2_idx, 'd*') ALLOW FILTERING;
In the above example the expected result would contain no rows, which would be the case if v2_idx is selected as the primary (i.e. most selective) index during query execution. However, if v1_idx is chosen instead, the results of its lookup will have no further filter applied, and so an incorrect result (the row with k=0) will be returned.
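The difference between the two execution paths can be illustrated with a small in-memory sketch. It is not Cassandra code: the class, the `Row` type and the lookup methods are hypothetical stand-ins for the table and the two indexes in the example, and the prefix match simply uses `String.startsWith`.

```java
import java.util.ArrayList;
import java.util.List;

public class CustomExpressionExample {
    // Hypothetical stand-in for one row of ks.t.
    public static final class Row {
        final int k;
        final int v1;
        final String v2;

        Row(int k, int v1, String v2) {
            this.k = k;
            this.v1 = v1;
            this.v2 = v2;
        }
    }

    // The two rows inserted in the example.
    static final List<Row> TABLE = List.of(new Row(0, 0, "abc"), new Row(1, 1, "def"));

    // Stand-in for a lookup in v1_idx (equality on v1).
    public static List<Row> lookupV1(int value) {
        List<Row> out = new ArrayList<>();
        for (Row r : TABLE) {
            if (r.v1 == value) out.add(r);
        }
        return out;
    }

    // Stand-in for the custom index: prefix match on v2 (assumed semantics of 'd*').
    public static List<Row> lookupV2Prefix(String prefix) {
        List<Row> out = new ArrayList<>();
        for (Row r : TABLE) {
            if (r.v2.startsWith(prefix)) out.add(r);
        }
        return out;
    }

    // v1_idx chosen as primary: the custom expression cannot be applied
    // as a filter, so the lookup result is returned as-is.
    public static List<Row> queryWithV1Primary(int v1, String prefix) {
        return lookupV1(v1);
    }

    // v2_idx chosen as primary: the remaining expression (v1 = ?) is a
    // plain equality, so it can be applied as a filter afterwards.
    public static List<Row> queryWithCustomPrimary(int v1, String prefix) {
        List<Row> out = new ArrayList<>();
        for (Row r : lookupV2Prefix(prefix)) {
            if (r.v1 == v1) out.add(r);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(queryWithV1Primary(0, "d").size());     // 1 row: incorrect
        System.out.println(queryWithCustomPrimary(0, "d").size()); // 0 rows: expected
    }
}
```

With v1_idx as the primary index, the row k=0 is returned even though 'abc' does not match the prefix; with v2_idx as the primary index, the equality on v1 is applied as a filter and the result is empty.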
Note: this has always been something of an issue for custom indexes, as the expressions they support may not be natively filterable by C*. For example, with the full text search syntax used by Stratio & DSE Search, if the custom index isn't selected, the filtering will erroneously remove all rows, because the value of the dummy column does not match the Lucene/Solr search expression literal. It's probably a fairly minor concern, as in most cases a query using a custom index will not include other expressions (usually because custom indexes are per-row indexes, and so can support multi-field expression syntax). Also, an index implementation can return a very low estimated result count to try to ensure it is selected; custom expressions just provide an opportunity to improve the situation.
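A minimal sketch of the selection rule proposed here, assuming a simplified model in which each candidate index reports an estimated result count and a flag saying whether it is the target of a custom expression. The `Candidate` type and `choosePrimary` method are hypothetical names for illustration only and do not correspond to Cassandra's actual classes.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class IndexSelectionSketch {
    // Hypothetical description of an index that could serve part of the query.
    public static final class Candidate {
        final String indexName;
        final long estimatedRows;        // the index's own estimate of result size
        final boolean customExpression;  // true if targeted by expr(index, ...)

        public Candidate(String indexName, long estimatedRows, boolean customExpression) {
            this.indexName = indexName;
            this.estimatedRows = estimatedRows;
            this.customExpression = customExpression;
        }
    }

    // Prefer the index named by a custom expression; otherwise fall back to
    // the simple heuristic of picking the lowest estimated result count.
    public static Optional<Candidate> choosePrimary(List<Candidate> candidates) {
        Optional<Candidate> custom = candidates.stream()
                .filter(c -> c.customExpression)
                .findFirst();
        if (custom.isPresent()) {
            return custom;
        }
        return candidates.stream()
                .min(Comparator.comparingLong((Candidate c) -> c.estimatedRows));
    }

    public static void main(String[] args) {
        List<Candidate> candidates = List.of(
                new Candidate("v1_idx", 1, false),
                new Candidate("v2_idx", 100, true));
        // v2_idx is chosen despite its higher estimate.
        System.out.println(choosePrimary(candidates).get().indexName);
    }
}
```

Under this rule a custom index no longer needs to report an artificially low estimate in order to be selected when the query explicitly targets it.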