  1. Zeppelin
  2. ZEPPELIN-3334

Set spark.scheduler.pool to authenticated user name


Details

    Description

      Setting spark.scheduler.pool to the authenticated user name would allow separate resource pools for different users when a shared Spark context / shared Spark interpreter is used.

      This improvement request is for "The interpreter will be instantiated Globally in shared process" Spark Interpreter mode.
       
      Per the Spark documentation (https://spark.apache.org/docs/latest/job-scheduling.html):
       

      within each Spark application, multiple “jobs” (Spark actions) may be running concurrently if they were submitted by different threads 
      ... /skip/
      threads. By “job”, in this section, we mean a Spark action (e.g. save, collect) and any tasks that need to run to evaluate that action. Spark’s scheduler is fully thread-safe and supports this use case to enable applications that serve multiple requests (e.g. queries for multiple users).
      ... /skip/
      Without any intervention, newly submitted jobs go into a default pool, but jobs’ pools can be set by adding the spark.scheduler.pool “local property” to the SparkContext in the thread that’s submitting them.

      Note that spark.scheduler.pool has to be set to the authenticated user name in the thread that submits that user's jobs, since it is a per-thread local property. This assumes Zeppelin internally runs a separate thread for each authenticated user.
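
      The per-thread behaviour described above can be sketched as follows. This is only an illustration of the idea, not Zeppelin's implementation: the helper names run_as_user and submit_for_users are hypothetical, and sc is assumed to be an object that behaves like a PySpark SparkContext (only its setLocalProperty method is used). In PySpark, Python threads map to separate JVM threads only when pinned thread mode is active, so that is a further assumption here.

```python
import threading

POOL_PROPERTY = "spark.scheduler.pool"


def run_as_user(sc, user_name, action):
    """Run action(sc) with the scheduler pool set to user_name.

    Hypothetical helper: `sc` is assumed to expose setLocalProperty(key, value)
    the way a SparkContext does. The property is local to the calling thread,
    so this must be called on the thread that submits the user's jobs.
    """
    sc.setLocalProperty(POOL_PROPERTY, user_name)
    try:
        return action(sc)
    finally:
        # Clearing the property should send later jobs on this thread
        # back to the default pool.
        sc.setLocalProperty(POOL_PROPERTY, None)


def submit_for_users(sc, jobs):
    """Run each (user_name, action) pair on its own thread.

    Returns a dict mapping user_name to the result of that user's action.
    Assumes each user name appears at most once in `jobs`.
    """
    results = {}

    def worker(user_name, action):
        results[user_name] = run_as_user(sc, user_name, action)

    threads = [
        threading.Thread(target=worker, args=(user_name, action))
        for user_name, action in jobs
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

      With a real shared SparkContext, a call such as submit_for_users(sc, [("alice", lambda c: c.parallelize(range(10)).count())]) would submit the count job from a thread whose pool is "alice". Pools only get separate shares if the fair scheduler is enabled for the application.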

            People

              Assignee: Unassigned
              Reporter: Ruslan Dautkhanov (Tagar)
              Votes: 0
              Watchers: 2
