  1. Spark
  2. SPARK-37503 Improve SparkSession/PySpark SparkSession startup
  3. SPARK-37727

Show ignored confs & hide warnings for conf already set in SparkSession.builder.getOrCreate

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.0
    • Component/s: SQL
    • Labels: None

    Description

      Currently, SparkSession.builder.getOrCreate() logs a warning even when the configurations passed to the builder merely duplicate values that are already set, and the warning does not tell users which configurations they need to fix. See the example below:

      ./bin/spark-shell --conf spark.abc=abc
      
      import org.apache.spark.sql.SparkSession
      spark.sparkContext.setLogLevel("DEBUG")
      SparkSession.builder.config("spark.abc", "abc").getOrCreate
      
      ...
      21:12:40.601 [main] WARN  org.apache.spark.sql.SparkSession - Using an existing SparkSession; some spark core configurations may not take effect.
      

      This is straightforward to diagnose when there are only a few configurations, but it becomes difficult when there are many, especially when they are defined in property files such as spark-defaults.conf, which are sometimes maintained separately by system admins.
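
      The behavior requested in the title could be sketched roughly as follows. This is a minimal illustration in plain Scala, not Spark's actual implementation: the object ConfDiff, its methods partitionConfs and warningFor, and the wording of the message are all hypothetical names introduced here. The idea is to split the options passed to the builder into those that already match the existing session (no warning needed) and those that differ or are missing (reported by name).

```scala
// Hypothetical helper for illustration only; not part of Spark's API.
object ConfDiff {

  /** Splits the requested options into (alreadySet, ignored).
    * An option counts as "already set" only if the existing confs
    * contain the same key with the same value.
    */
  def partitionConfs(
      existing: Map[String, String],
      requested: Map[String, String]
  ): (Map[String, String], Map[String, String]) =
    requested.partition { case (k, v) => existing.get(k).contains(v) }

  /** Builds a warning that names each ignored option, or None if there are none. */
  def warningFor(ignored: Map[String, String]): Option[String] =
    if (ignored.isEmpty) None
    else {
      val listed = ignored.toSeq
        .sortBy(_._1)
        .map { case (k, v) => s"  $k=$v" }
        .mkString("\n")
      // Assumed message text; the real log message may be worded differently.
      Some(s"Using an existing SparkSession; these configurations were ignored:\n$listed")
    }
}
```

      Under these assumptions, the example above (spark.abc=abc set on the command line and again on the builder) would land entirely in the "already set" group, so warningFor would return None and nothing would be logged.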

      See also https://github.com/apache/spark/pull/34757#discussion_r769248275

            People

              Assignee: Hyukjin Kwon (gurwls223)
              Reporter: Hyukjin Kwon (gurwls223)