  1. Spark
  2. SPARK-25941

Random forest score decreased after upgrading Spark version


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.3.2
    • Fix Version/s: None
    • Component/s: Deploy, Input/Output, ML

    Description

      Problem description

      I ran the same random forest analysis with two different versions of Spark and compared the scores:

      • spark-core_2.10, version 2.0.0
        • RandomForestsKaggle Score = 0.8978765219058574
      • spark-core_2.11, version 2.4.0
        • RandomForestsKaggle Score = 0.8886987035251259

      Source: https://github.com/smartscity/Kaggle_Titanic_spark

      The GitHub repository contains the example source code and a README.

       

      Introduction

      This case is based on the Titanic competition on Kaggle: https://www.kaggle.com/c/titanic

      Conclusion

      After upgrading Spark from version 2.0.0 to version 2.4.0, the random forest score dropped by about 0.01 (from 0.8979 to 0.8887).
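
      As a quick check of the size of the drop, the two scores reported above can be compared directly. This is only a sketch of the arithmetic; the variable names are illustrative and not taken from the linked repository.

```python
# Scores copied from the problem description above.
score_spark_2_0_0 = 0.8978765219058574  # spark-core_2.10, version 2.0.0
score_spark_2_4_0 = 0.8886987035251259  # spark-core_2.11, version 2.4.0

# Absolute difference between the two runs.
absolute_drop = score_spark_2_0_0 - score_spark_2_4_0

# Difference relative to the score obtained with the older version.
relative_drop = absolute_drop / score_spark_2_0_0

print(f"absolute drop: {absolute_drop:.4f}")
print(f"relative drop: {relative_drop:.2%}")
```

      The absolute difference is slightly below 0.01, which corresponds to roughly one percent of the original score.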

      Expectation

      The random forest score should not drop when the Spark version is upgraded.

       


          People

            Assignee: Unassigned
            Reporter: jack li (lyl2008dsg)
            Votes: 0
            Watchers: 3
