Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- 0.9.0
- None
- None
Description
Setting spark.scheduler.pool to the authenticated user name would allow separate resource pools for different users when a Spark context / Spark interpreter is shared.
This improvement request applies to the Spark interpreter mode "The interpreter will be instantiated Globally in shared process".
Per the Spark documentation, https://spark.apache.org/docs/latest/job-scheduling.html
"within each Spark application, multiple “jobs” (Spark actions) may be running concurrently if they were submitted by different threads
... /skip/
threads. By “job”, in this section, we mean a Spark action (e.g. save, collect) and any tasks that need to run to evaluate that action. Spark’s scheduler is fully thread-safe and supports this use case to enable applications that serve multiple requests (e.g. queries for multiple users).
... /skip/
Without any intervention, newly submitted jobs go into a default pool, but jobs’ pools can be set by adding the spark.scheduler.pool “local property” to the SparkContext in the thread that’s submitting them."
Note that spark.scheduler.pool is a per-thread local property, so it has to be set to the authenticated user name in the thread that submits that user's jobs. This assumes Zeppelin internally uses a separate thread for each authenticated user.
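To make the request concrete, here is a minimal PySpark sketch of the idea, not Zeppelin's actual implementation. It assumes `sc` is a shared SparkContext started with spark.scheduler.mode=FAIR; the helper names `run_in_user_pool` and `submit_for_user` are hypothetical. Whether a local property set from a Python thread reaches the matching JVM thread may depend on the Spark version and its pinned-thread setting, so treat the threading part as an assumption to verify.

```python
import threading


def run_in_user_pool(sc, user, action):
    """Run action(sc) with the current thread's jobs assigned to a pool named after user."""
    # Local properties are per-thread: only jobs submitted from this thread are affected.
    sc.setLocalProperty("spark.scheduler.pool", user)
    try:
        return action(sc)
    finally:
        # Clearing the property (the Spark docs show null in Scala; None is assumed
        # to map to it here) sends later jobs on this thread to the default pool.
        sc.setLocalProperty("spark.scheduler.pool", None)


def submit_for_user(sc, user, action):
    """Start a dedicated thread for one user's work (hypothetical helper)."""
    results = {}

    def worker():
        results["value"] = run_in_user_pool(sc, user, action)

    thread = threading.Thread(target=worker, name=f"user-{user}")
    thread.start()
    return thread, results
```

For example, `submit_for_user(sc, "alice", lambda s: s.parallelize(range(100)).count())` would run the count in a pool named "alice". Per the Spark documentation, pools not declared in the fair scheduler allocation file get default settings, so per-user weights or minimum shares would still need explicit configuration.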
Issue Links
- is related to: ZEPPELIN-3563 Add pool to paragraph property that use spark interpreter (Closed)