  1. Mahout
  2. MAHOUT-905

CachingUserSimilarity and CachingItemSimilarity have wrong (far too small) default maxSizes

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 0.5
    • None
    • None

    Description

      I am currently tuning my recommender discussed here: http://thread.gmane.org/gmane.comp.apache.mahout.user/10433.

      As a first step, I wrapped my LogLikelihoodSimilarity in a CachingUserSimilarity and profiled the calls with Java VisualVM. I noticed that I got no performance benefit, so I had a look at the code.

      Line 47 of CachingUserSimilarity.java, this(similarity, dataModel.getNumItems());, is wrong. If we want to cache all item similarities, we need a cache with room for every unordered pair, that is (dataModel.getNumItems()*(dataModel.getNumItems()-1))/2 possible entries.

      I now compute this size in the constructor. I have attached a patch that makes this change against trunk.
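
      The pair count from the description can be sketched as a small standalone helper. This is only an illustration under stated assumptions, not the contents of the attached patch: the class name PairCacheSize and the method maxPairs are hypothetical, and n stands for whatever dataModel.getNumItems() (or the user count) returns. The multiplication is done in long and clamped, because n*(n-1) overflows int for fairly modest n.

```java
// Hypothetical helper, not part of Mahout: computes how many unordered
// pairs exist among n entities, i.e. n*(n-1)/2.
public final class PairCacheSize {

    private PairCacheSize() {}

    /**
     * Returns n*(n-1)/2, clamped to Integer.MAX_VALUE because
     * cache sizes are expressed as int.
     */
    public static int maxPairs(int n) {
        if (n < 2) {
            return 0; // fewer than two entities means no pairs to cache
        }
        // Widen to long before multiplying to avoid int overflow.
        long pairs = (long) n * (n - 1) / 2;
        return (int) Math.min(pairs, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        // n = 1000 gives 1000 * 999 / 2 pairs
        System.out.println(maxPairs(1000)); // prints 499500
    }
}
```

      The result could then be passed as the maxCacheSize argument of the two-argument constructor, for example new CachingUserSimilarity(similarity, PairCacheSize.maxPairs(n)). Note that a cache sized this way grows quadratically with n, so for large data models it may not fit in memory.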

      Attachments

        1. CachingSimilariyAdjustedDefaultSize.patch
          1 kB
          Manuel Blechschmidt

          People

            srowen Sean R. Owen
            manuel_b Manuel Blechschmidt
            Votes: 0
            Watchers: 0

              Time Tracking

                Estimated: 0.5h
                Remaining: 0.5h
                Logged: Not Specified