PyLucene / PYLUCENE-2

Memory leak when searching in real time reader

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Invalid
    • Environment: Ubuntu 9.10, Python 2.6, PyLucene 3.0

    Description

      Our code is shown below.
      We have 31 index directories in /tmp (about 5 million records in our indexes). We want real-time search, so we use writer.getReader() to get a real-time reader.
      We then run the search repeatedly, and after about 10 minutes Java fails with an 'out of memory' error.

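      import os
      from lucene import (initVM, CLASSPATH, File, IndexSearcher, IndexWriter,
                          MultiSearcher, QueryParser, SimpleFSDirectory,
                          StandardAnalyzer, Version)

      initVM(CLASSPATH, initialheap='100m', maxheap='100m')
      keywordQuery = QueryParser(Version.LUCENE_CURRENT, "content",
                                 StandardAnalyzer(Version.LUCENE_CURRENT)).parse("when AND you")

      # Open one IndexWriter per existing index directory.
      # The directory names are assumed to be /tmp/1 .. /tmp/31.
      writers = []
      for i in range(1, 32):
          indexDir = os.path.join("/tmp", str(i))
          luceneDir = SimpleFSDirectory(File(indexDir))

          # create=False: append to the existing index instead of overwriting it.
          writer = IndexWriter(luceneDir, StandardAnalyzer(Version.LUCENE_CURRENT),
                               False, IndexWriter.MaxFieldLength.LIMITED)
          writer.setRAMBufferSizeMB(32.0)
          writer.setUseCompoundFile(True)
          writer.setMergeFactor(10)
          writers.append(writer)

      # Search repeatedly, taking a fresh real-time reader from every writer each time.
      while True:
          searchersList = []
          readers = []
          for writer in writers:
              reader = writer.getReader()
              searcher = IndexSearcher(reader)
              searchersList.append(searcher)
              readers.append(reader)
          multiSearcherInstance = MultiSearcher(searchersList)
          # IndexerCons.TOP_DOC_NUMBER is a constant from our own code (not shown here).
          docs = multiSearcherInstance.search(keywordQuery, IndexerCons.TOP_DOC_NUMBER).scoreDocs

          # Release everything opened in this iteration.
          multiSearcherInstance.close()
          for searcher in searchersList:
              searcher.close()
          for reader in readers:
              reader.close()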

      When we use a normal reader (opened directly from the directories) instead of the real-time reader, the test runs fine, with no 'out of memory' error.
      The bug may come from Java Lucene; I am not sure.
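
The loop in the description closes its searchers and readers only when the search call returns normally. The following is a minimal sketch, not taken from the report, of wrapping one iteration in try/finally so that every reader opened in that iteration is closed even if the search raises. StubReader, search_once, open_fns and do_search are hypothetical names introduced for illustration; StubReader only counts open instances so the pattern can be run without PyLucene. In real code each entry of open_fns would presumably be something like writer.getReader. This sketch says nothing about whether getReader() itself retains memory.

```python
class StubReader(object):
    """Hypothetical stand-in for a reader; it only tracks how many are open."""
    open_count = 0

    def __init__(self):
        StubReader.open_count += 1
        self.closed = False

    def close(self):
        # Closing twice is harmless; the counter drops only once.
        if not self.closed:
            self.closed = True
            StubReader.open_count -= 1


def search_once(open_fns, do_search):
    """Open one reader per callable, run do_search, and always close them."""
    readers = []
    try:
        for open_fn in open_fns:
            readers.append(open_fn())
        return do_search(readers)
    finally:
        # Runs on success and on failure, closing in reverse order of opening.
        for reader in reversed(readers):
            reader.close()


def failing_search(readers):
    """Simulates a search call that raises part-way through an iteration."""
    raise RuntimeError("simulated search failure")
```

Calling search_once([StubReader] * 31, len) opens 31 stub readers, passes them to the search callable and closes all of them afterwards; calling it with failing_search raises, but StubReader.open_count still returns to zero.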

          People

            Assignee: Unassigned
            Reporter: feng xiaojie (fxjwind)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 336h
                Remaining: 336h
                Logged: Not Specified