  1. Solr
  2. SOLR-16977

DenseVector queries fail with an unclear error message when either the query or a document is an all-zeros vector


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 9.2.1
    • Fix Version/s: None
    • Component/s: query
    • Labels: None
    • Environment: Solr 9.2.1

    Description

      When doing a dense vector search, if the query passes an all-zero vector, Solr fails with an uninformative error message.

       

      e.g.

      q={!knn f=vector topK=1}[0.0, 0.0, 0.0, 0.0]

       

      Error message from Solr:

      {'error': {'msg': 'docID must be >= 0 and < maxDoc=2 (got docID=2147483647)', 'trace': 'java.lang.IllegalArgumentException: docID must be >= 0 and < maxDoc=2 (got docID=2147483647)\n\tat org.apache.lucene.index.BaseCompositeReader.readerIndex(BaseCompositeReader.java:225)\n\tat org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:153)\n\tat org.apache.solr.search.SolrDocumentFetcher.docNC(SolrDocumentFetcher.java:274)\n\tat org.apache.solr.search.SolrDocumentFetcher.lambda$doc$0(SolrDocumentFetcher.java:258)\n\tat org.apache.solr.search.CaffeineCache.computeAsync(CaffeineCache.java:234)\n\tat org.apache.solr.search.CaffeineCache.computeIfAbsent(CaffeineCache.java:250)\n\tat org.apache.solr.search.SolrDocumentFetcher.doc(SolrDocumentFetcher.java:258)\n\tat org.apache.solr.search.SolrDocumentFetcher$RetrieveFieldsOptimizer.getSolrDoc(SolrDocumentFetcher.java:855)\n\tat org.apache.solr.search.SolrDocumentFetcher.solrDoc(SolrDocumentFetcher.java:307)\n\tat org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:94)\n\tat org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:56)\n\tat org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:257)\n\tat org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:196)\n\tat org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:47)\n\tat org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:403)\n\tat org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:311)\n\tat org.apache.solr.response.JSONWriter.writeResponse(JSONWriter.java:77)\n\tat org.apache.solr.response.JSONResponseWriter.write(JSONResponseWriter.java:63)\n\tat org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:71)\n\tat org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:988)\n\tat 
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:593)\n\tat org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:252)\n\tat org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilter$0(SolrDispatchFilter.java:220)\n\tat org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.java:257)\n\tat org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:227)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:215)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197)\n\tat org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:210)\n\tat org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:578)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1570)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1383)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1543)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1305)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)\n\tat 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:149)\n\tat org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:228)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:141)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)\n\tat org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:301)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)\n\tat org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:822)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:563)\n\tat org.eclipse.jetty.server.HttpChannel.lambda$handle$0(HttpChannel.java:505)\n\tat org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:762)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:497)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:282)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)\n\tat org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:558)\n\tat org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:379)\n\tat org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:146)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)\n\tat org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)\n\tat org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:416)\n\tat org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:385)\n\tat 
org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:272)\n\tat org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:140)\n\tat org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:934)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1078)\n\tat java.base/java.lang.Thread.run(Thread.java:833)\n', 'code': 500}}

       

      Similarly, even if the query contains a non-zero vector, a similar error occurs when a document contains an all-zero embedding vector. None of the test cases in https://github.com/apache/solr/blob/bee3accb8a38b7420da0919ebacf05ef6060fc94/solr/core/src/test/org/apache/solr/search/neural/KnnQParserTest.java#L420 include an all-zeros vector.

       

      This was tested on Solr 9.2.1 with a DenseVector field using cosine similarity. Cosine similarity divides by the vector norms, and for an all-zeros vector that norm is zero, so the similarity is undefined. An all-zero vector is not just a contrived input: it can arise in practice from underflow when the embedding vector is generated.
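To make the arithmetic concrete, here is a minimal Python sketch of plain cosine similarity. It is not Lucene's or Solr's implementation; the function name `cosine_similarity` is illustrative only. It shows that a zero norm leaves the score undefined, which this sketch reports as NaN.

```python
import math


def cosine_similarity(a, b):
    """Textbook cosine similarity (illustrative, not Lucene's code).

    Returns NaN when either vector has zero norm, because the
    expression dot / (|a| * |b|) is then 0/0 and has no defined value.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    denom = norm_a * norm_b
    if denom == 0.0:
        # Python would raise ZeroDivisionError here; return NaN explicitly
        # to mirror what IEEE float division 0.0/0.0 produces.
        return float("nan")
    return dot / denom


zero = [0.0, 0.0, 0.0, 0.0]
doc = [1.0, 2.0, 3.0, 4.0]
print(math.isnan(cosine_similarity(zero, doc)))  # zero query vector
print(math.isnan(cosine_similarity(doc, zero)))  # zero document vector
```

Both cases from this report (a zero query vector and a zero document vector) hit the same zero denominator.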

       

      Minimal example in Python:

      solr_vector_error_example-1.py
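The attached script is not reproduced on this page. Separately from it, the following is a hypothetical client-side guard, sketched under the assumption that the caller builds the `{!knn}` query string itself. The helper name `build_knn_query` is made up for illustration and is not part of Solr or any client library. It rejects an all-zero query vector before the request is sent, so the caller gets a clear error instead of the stack trace above.

```python
def build_knn_query(field, vector, top_k=10):
    """Build a Solr {!knn} query string (hypothetical helper).

    Raises ValueError for an empty or all-zero vector, since cosine
    similarity is undefined for a zero-norm vector.
    """
    if not vector:
        raise ValueError("query vector must not be empty")
    if all(x == 0.0 for x in vector):
        raise ValueError(
            "query vector is all zeros; cosine similarity is undefined for it"
        )
    values = ", ".join(repr(float(x)) for x in vector)
    return "{!knn f=%s topK=%d}[%s]" % (field, top_k, values)


print(build_knn_query("vector", [0.1, 0.2, 0.3, 0.4], top_k=1))

try:
    build_knn_query("vector", [0.0, 0.0, 0.0, 0.0], top_k=1)
except ValueError as err:
    print("rejected:", err)
```

This only covers the query side. A document indexed with an all-zero vector would need a similar check at indexing time.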

          People

            Assignee: Unassigned
            Reporter: Alex (ashum)
