  1. Spark
  2. SPARK-18454

Changes to improve Nearest Neighbor Search for LSH

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ML

    Description

      We all agree on the following improvement to Multi-Probe NN Search:
      (1) Use approxQuantile to compute the hashDistance threshold instead of doing a full sort of the whole dataset.
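
      To illustrate the idea, here is a minimal pure-Python sketch, not the Spark implementation. It assumes a hypothetical hash_distance (the count of differing hash positions) and swaps in a sample-based quantile estimate for Spark's approxQuantile. The names approx_quantile, approx_nearest_neighbors, and the slack parameter are illustrative only. The point is that a quantile over the hash distances gives a threshold for keeping about k candidates, so only that small candidate set is sorted by true distance.

```python
import random


def hash_distance(sig_a, sig_b):
    """Hypothetical hash distance: number of positions where two signatures differ."""
    return sum(1 for a, b in zip(sig_a, sig_b) if a != b)


def approx_quantile(values, q, sample_size=1000, seed=0):
    """Estimate the q-quantile from a random sample.

    Stand-in for an approximate quantile routine: only the sample is sorted,
    never the full list of values.
    """
    rng = random.Random(seed)
    if len(values) <= sample_size:
        sample = list(values)
    else:
        sample = rng.sample(values, sample_size)
    ordered = sorted(sample)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]


def approx_nearest_neighbors(dataset, key_sig, k, true_distance, slack=2.0):
    """Return up to k approximate nearest neighbors.

    dataset: list of (item, signature) pairs.
    true_distance: function mapping an item to its exact distance from the key.
    slack: oversampling factor (an assumption), so the threshold keeps
           somewhat more than k candidates.
    """
    if not dataset:
        return []
    dists = [hash_distance(sig, key_sig) for _, sig in dataset]
    # Quantile that should keep roughly slack * k rows.
    q = min(1.0, slack * k / len(dataset))
    threshold = approx_quantile(dists, q)
    # Filter by the hash-distance threshold; no full sort of the dataset.
    candidates = [item for (item, _), d in zip(dataset, dists) if d <= threshold]
    # Exact sort only over the small candidate set.
    return sorted(candidates, key=true_distance)[:k]


def signature(x, width=10, offsets=(0, 3, 6, 9)):
    """Toy 1-D bucketing signature, used only to exercise the sketch."""
    return tuple((x + off) // width for off in offsets)
```

      For example, with data = [(x, signature(x)) for x in range(1000)], calling approx_nearest_neighbors(data, signature(500), 5, lambda x: abs(x - 500)) filters on the estimated threshold first and ranks only the surviving candidates.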

      The following points are still under discussion:
      (1) Which hashDistance (or probing sequence) we should use for MinHash
      (2) What issues the current Nearest Neighbor implementation has and how we should change it

            People

              Assignee: Unassigned
              Reporter: Yun Ni (yunn)
              Votes: 1
              Watchers: 7
