Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
9.3
None
None
Description
Starting with Solr 9.4, configuring a DenseVectorField with vectorDimension > 1024 no longer works by default. There is no error on startup, but indexing fails with errors like:
2> => org.apache.solr.common.SolrException: Exception writing document id 2 to the index; possible analysis error: Field [vector]vector's dimensions must be <= [1024]; got 1600
2>   at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:329)
2> org.apache.solr.common.SolrException: Exception writing document id 2 to the index; possible analysis error: Field [vector]vector's dimensions must be <= [1024]; got 1600
2>   at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:329) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
2>   at org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:76) ~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 - stillalex - 2023-10-10 19:10:39]
...
2> Caused by: java.lang.IllegalArgumentException: Field [vector]vector's dimensions must be <= [1024]; got 1600
2>   at org.apache.lucene.index.IndexingChain.validateMaxVectorDimension(IndexingChain.java:843) ~[lucene-core-9.8.0.jar:9.8.0 d914b3722bd5b8ef31ccf7e8ddc638a87fd648db - 2023-09-21 21:57:47]
...
This is because Lucene 9.8 moved the dimension size limitation to the codec. Solr 9.4's SchemaCodecFactory was updated to implement a per-field SolrDelegatingKnnVectorsFormat that respects the vectorDimension configured for each DenseVectorField, but the SchemaCodecFactory is not implicitly used by default in Solr, nor is it explicitly configured in the _default configset.
Existing DenseVectorField users who encounter this error when upgrading to Solr >= 9.4, and new users attempting to use DenseVectorField with vectorDimension > 1024, need to explicitly configure SchemaCodecFactory in the solrconfig.xml for each collection.
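For reference, a minimal sketch of that workaround, assuming the usual solrconfig.xml layout where codecFactory is a direct child of the top-level config element. The surrounding elements are elided, and nothing else in the file should need to change for this:

```xml
<!-- solrconfig.xml (per collection): sketch of the workaround described above -->
<config>
  <!-- ... existing configuration ... -->

  <!-- Explicitly select the schema-aware codec so that per-field
       DenseVectorField settings such as vectorDimension are honored. -->
  <codecFactory class="solr.SchemaCodecFactory"/>
</config>
```

After changing the config, the collection must be reloaded (or the configset re-uploaded in SolrCloud) for the codec factory to take effect.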
Issue Links
- is caused by: SOLR-16985 Upgrade Lucene to 9.8.0 (Closed)
- is fixed by: SOLR-17046 SchemaCodecFactory should be the implicit default if no <codeFactory/> is configured (Closed)
- is related to: SOLR-17052 SchemaCodecFactory/IndexSchema/FieldType relationships are kludgy, buggy, and inefficient (Open)
- relates to: SOLR-17047 (SolrCore's) CodecFactory validation ignores schema based KnnVectorsFormat options on init (Open)