SOLR-9579

Reuse lucene FieldType in createField flow during ingestion


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 7.0
    • Fix Version/s: 7.0
    • Component/s: Schema and Analysis
    • Environment: This has been primarily tested on Windows 8 and Windows Server 2012 R2

    Description

      During ingestion, createField in FieldType is called for each field of each document. For subclasses of FieldType that do not override createField, a Lucene FieldType is created and stored alongside the value. However, Lucene FieldType objects created from the same SchemaField are identical. In a test ingesting one million rows with 22 fields each, we created 22 million Lucene FieldType objects when only 22 were needed. Solr should lazily initialize one Lucene FieldType per SchemaField and reuse it for subsequent ingestion. This reduces memory usage and also relieves significant pressure on the GC.

      There are also subclasses of Solr's FieldType that create a separate Lucene FieldType for stored fields instead of reusing the static one in StoredField.
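
      The lazy initialization proposed above could look roughly like the following sketch. It is not the actual Solr patch: SketchSchemaField, SketchLuceneFieldType, and SketchField are hypothetical stand-ins for SchemaField, the Lucene FieldType, and an indexable field, and the creation counter exists only to make the reuse visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Demo driver: simulates indexing many documents against a 22-field schema. */
class FieldTypeReuseDemo {
    public static void main(String[] args) {
        int fields = 22;
        int docs = 1_000;
        SketchSchemaField[] schema = new SketchSchemaField[fields];
        for (int i = 0; i < fields; i++) {
            schema[i] = new SketchSchemaField("field" + i, true, i % 2 == 0);
        }
        // Every document creates one field object per schema field.
        for (int d = 0; d < docs; d++) {
            for (SketchSchemaField sf : schema) {
                sf.createField("value" + d);
            }
        }
        System.out.println("type objects created: " + SketchLuceneFieldType.CREATED.get());
    }
}

/** Hypothetical stand-in for the Lucene FieldType: per-field index options. */
final class SketchLuceneFieldType {
    /** Counts constructions so the demo can show how many were needed. */
    static final AtomicInteger CREATED = new AtomicInteger();
    final boolean indexed;
    final boolean stored;

    SketchLuceneFieldType(boolean indexed, boolean stored) {
        this.indexed = indexed;
        this.stored = stored;
        CREATED.incrementAndGet();
    }
}

/** Hypothetical stand-in for an indexable field: name, value, and a shared type. */
final class SketchField {
    final String name;
    final String value;
    final SketchLuceneFieldType type;

    SketchField(String name, String value, SketchLuceneFieldType type) {
        this.name = name;
        this.value = value;
        this.type = type;
    }
}

/** Hypothetical stand-in for SchemaField that caches its type object. */
final class SketchSchemaField {
    private final String name;
    private final boolean indexed;
    private final boolean stored;
    // Built on first use, then shared by every field created from this schema field.
    private volatile SketchLuceneFieldType cachedType;

    SketchSchemaField(String name, boolean indexed, boolean stored) {
        this.name = name;
        this.indexed = indexed;
        this.stored = stored;
    }

    /** Double-checked locking so concurrent indexing threads build the type once. */
    SketchLuceneFieldType luceneFieldType() {
        SketchLuceneFieldType t = cachedType;
        if (t == null) {
            synchronized (this) {
                t = cachedType;
                if (t == null) {
                    t = new SketchLuceneFieldType(indexed, stored);
                    cachedType = t;
                }
            }
        }
        return t;
    }

    /** Called once per field per document; allocates only the field, not the type. */
    SketchField createField(String value) {
        return new SketchField(name, value, luceneFieldType());
    }
}
```

      Run as written, the demo creates 22,000 field objects but reports only 22 type objects. Sharing is safe in this sketch because the type object is immutable after construction; a real implementation would need the same guarantee for the cached Lucene FieldType.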

          People

            Assignee: Unassigned
            Reporter: John Call
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: 2h
              Remaining: 2h
              Logged: Not Specified
