[ZEPPELIN-3334] Set spark.scheduler.pool to authenticated user name - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.9.0
Fix Version/s: None
Component/s: zeppelin-interpreter, zeppelin-server, zeppelin-zengine
Labels: None

Description

Setting spark.scheduler.pool to the authenticated user name would make it possible to have separate resource pools for different users when a shared Spark context / shared Spark interpreter is used.

This improvement request applies to the Spark interpreter mode "The interpreter will be instantiated Globally in shared process".

Per the Spark documentation (https://spark.apache.org/docs/latest/job-scheduling.html):

"within each Spark application, multiple “jobs” (Spark actions) may be running concurrently if they were submitted by different threads
[...]
By “job”, in this section, we mean a Spark action (e.g. save, collect) and any tasks that need to run to evaluate that action. Spark’s scheduler is fully thread-safe and supports this use case to enable applications that serve multiple requests (e.g. queries for multiple users).
[...]
Without any intervention, newly submitted jobs go into a default pool, but jobs’ pools can be set by adding the spark.scheduler.pool “local property” to the SparkContext in the thread that’s submitting them."

Note that spark.scheduler.pool has to be set in the thread that submits a given user's jobs, so this assumes that Zeppelin internally uses a separate thread for each authenticated user.
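
As an illustration of the per-thread pool assignment requested here, the following is a minimal sketch in Python, not Zeppelin's actual implementation. The helper name `run_in_user_pool` and the `user_name` argument are hypothetical; `sc` is assumed to be a PySpark `SparkContext` (any object exposing `setLocalProperty` works), and the sketch assumes the application runs with `spark.scheduler.mode` set to `FAIR`, since pools are a feature of the fair scheduler.

```python
from typing import Any, Callable, TypeVar

T = TypeVar("T")

POOL_PROPERTY = "spark.scheduler.pool"


def run_in_user_pool(sc: Any, user_name: str, job: Callable[[], T]) -> T:
    """Run job() with the scheduler pool of the calling thread set to user_name.

    Hypothetical helper: `sc` is assumed to be a pyspark SparkContext.
    The property is local to the calling thread, so this must be invoked
    from the thread that actually submits the Spark actions.
    """
    sc.setLocalProperty(POOL_PROPERTY, user_name)
    try:
        return job()
    finally:
        # Clear the property so that a reused thread does not keep
        # submitting jobs to the previous user's pool.
        sc.setLocalProperty(POOL_PROPERTY, None)
```

With a real context this might be called as `run_in_user_pool(sc, "alice", lambda: sc.parallelize(range(10)).count())`, where `"alice"` stands in for whatever name the authentication layer reports. Clearing the property afterwards is a design choice for thread pools that reuse threads across users; if each user has a dedicated thread, setting it once would be enough.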

Issue Links

is related to: ZEPPELIN-3563 Add pool to paragraph property that use spark interpreter (Closed)

People

Assignee: Unassigned

Reporter: Ruslan Dautkhanov

Votes: 0

Watchers: 2

Dates

Created: 14/Mar/18 23:08

Updated: 28/Jun/18 20:47