  1. Spark
  2. SPARK-6340

mllib.IDF for LabelPoints

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.3.0
    • Fix Version/s: None
    • Component/s: MLlib
    • Environment: python 2.7.8
      pyspark
      OS: Linux Mint 17 Qiana (Cinnamon 64-bit)

    Description

      As per: http://apache-spark-user-list.1001560.n3.nabble.com/Using-TF-IDF-from-MLlib-td19429.html#a19528

      Having IDF.fit accept LabeledPoint objects would be useful since, correct me if I'm wrong, there currently isn't a way to keep track of which labels belong to which documents when applying a conventional TF-IDF transformation to labelled text data.
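
      For illustration, here is a minimal pure-Python sketch of one way labels can stay attached to documents through a TF-IDF step: split the labels off, transform the documents, then pair the two back up by position. The function name tf_idf_with_labels and the sample data are hypothetical and not part of MLlib. The smoothed formula log((m + 1) / (df + 1)) is the one the MLlib documentation gives for IDF. In PySpark the same pattern would presumably be a positional zip of a labels RDD with the transformed features RDD; that is an assumption and is not verified against 1.3.0 here.

```python
import math
from collections import Counter


def tf_idf_with_labels(labelled_docs):
    """Hypothetical helper: takes a list of (label, tokens) pairs and
    returns a list of (label, {term: tf-idf weight}) pairs in the same order."""
    # Split labels from documents; index i in both lists is the same document.
    labels = [label for label, _ in labelled_docs]
    docs = [tokens for _, tokens in labelled_docs]

    # Raw term frequencies per document.
    tfs = [Counter(tokens) for tokens in docs]

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tf in tfs:
        df.update(tf.keys())

    # Smoothed inverse document frequency, log((m + 1) / (df + 1)).
    m = len(docs)
    idf = {term: math.log((m + 1.0) / (count + 1.0)) for term, count in df.items()}

    # Scale each term frequency by its IDF weight.
    tfidf = [{term: freq * idf[term] for term, freq in tf.items()} for tf in tfs]

    # Re-attach labels by position; the order was never changed.
    return list(zip(labels, tfidf))


# Made-up sample data: two labelled documents.
data = [(1.0, ["spark", "mllib", "spark"]), (0.0, ["hadoop", "mllib"])]
result = tf_idf_with_labels(data)
```

      The design relies only on the transformation preserving document order, so the labels never need to pass through the IDF step itself.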

          People

            Assignee: Unassigned
            Reporter: Kian Ho (kian.ho)
            Votes: 0
            Watchers: 2
