[SOLR-11891] DocsStreamer populates SolrDocument w/unnecessary fields - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 5.4, 6.4.2, 6.6.2
Fix Version/s: 7.4, 8.0
Component/s: Response Writers
Labels: None

Description

We observe that Solr query time increases significantly with the number of rows requested, even when all we retrieve for each document is fl=id,score. After some debugging, we found that most of the added time is spent in BinaryResponseWriter, converting each Lucene document into a SolrDocument. Inside convertLuceneDocToSolrDoc():

https://github.com/apache/lucene-solr/blob/df874432b9a17b547acb24a01d3491839e6a6b69/solr/core/src/java/org/apache/solr/response/DocsStreamer.java#L182

I am a bit puzzled why we need to iterate through all the fields in the document. Why can’t we just iterate through the requested field list?

https://github.com/apache/lucene-solr/blob/df874432b9a17b547acb24a01d3491839e6a6b69/solr/core/src/java/org/apache/solr/response/DocsStreamer.java#L156

For example, when we pass in the field list as

sdoc = convertLuceneDocToSolrDoc(doc, rctx.getSearcher().getSchema(), fnames)

and just iterate through fnames, there is a significant performance boost in our case.
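To make the proposal concrete, here is a minimal, self-contained sketch of the two strategies. It does not use the real Lucene or Solr classes: a plain Map stands in for the stored document, and the names FieldFilterSketch, convertAllFields and convertRequestedFields are hypothetical, chosen only for illustration. It also ignores multi-valued fields, pseudo-fields such as score, and any fields that document transformers might need beyond the requested list.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustration only: a Map<String, Object> stands in for a stored Lucene
// document. None of these names come from the Solr code base.
public class FieldFilterSketch {

    // Current behavior as described in this issue: every stored field is
    // copied, so the cost grows with the number of fields in the document.
    static Map<String, Object> convertAllFields(Map<String, Object> storedDoc) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : storedDoc.entrySet()) {
            out.put(e.getKey(), e.getValue());
        }
        return out;
    }

    // Proposed behavior: iterate over the requested names only, so the cost
    // grows with the size of the field list instead.
    static Map<String, Object> convertRequestedFields(Map<String, Object> storedDoc,
                                                      Set<String> fnames) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String name : fnames) {
            Object value = storedDoc.get(name);
            if (value != null) { // skip requested names the document lacks
                out.put(name, value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> storedDoc = new LinkedHashMap<>();
        storedDoc.put("id", "doc-1");
        storedDoc.put("title", "some title");
        storedDoc.put("body", "a large stored text field");

        Set<String> fnames = Set.of("id"); // requires Java 9+

        System.out.println(convertAllFields(storedDoc).keySet());
        System.out.println(convertRequestedFields(storedDoc, fnames).keySet());
    }
}
```

With a wide schema and a short field list, the second method touches far fewer entries per document, which is the effect described above.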

Attachments


- SOLR-11891.patch.BAD (14 kB), 15/Mar/18 02:21, Chris M. Hostetter
- SOLR-11891.patch (17 kB), 15/Mar/18 23:36, Chris M. Hostetter
- DocsStreamer.java.diff (2 kB), 05/Feb/18 19:46, wei wang

Issue Links

incorporates

- SOLR-12107 (Closed): [child] doc transformer used w/o uniqueKey in 'fl' fails with NPE unless documentCache is enabled
- SOLR-12108 (Closed): raw transformers ([json] and [xml]) drop the field value if wt is not a match and documentCache is not enabled

People

Assignee: Chris M. Hostetter

Reporter: wei wang

Votes: 0

Watchers: 5

Dates

Created: 23/Jan/18 21:54

Updated: 08/Jun/19 15:14

Resolved: 19/Mar/18 21:32