Zeppelin / ZEPPELIN-5737

Deadlock during Interpreter Creation

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.10.1
    • Fix Version/s: 0.11.0
    • Component/s: Interpreters
    • Labels: None

    Description

      I encountered the following deadlock while starting the Python interpreter.
      It is easy to trigger: stop the interpreter via the REST API while it is still starting.

      Found one Java-level deadlock:
      =============================
      "Thread-29":
        waiting to lock monitor 0x00007fde240084d8 (object 0x00000000804c7120, a org.apache.zeppelin.interpreter.LazyOpenInterpreter),
        which is held by "FIFOScheduler-interpreter_1515166446-Worker-1"
      "FIFOScheduler-interpreter_1515166446-Worker-1":
        waiting to lock monitor 0x00007fde20242928 (object 0x00000000804941e0, a org.apache.zeppelin.interpreter.InterpreterGroup),
        which is held by "pool-3-thread-8"
      "pool-3-thread-8":
        waiting to lock monitor 0x00007fde202429d8 (object 0x00000000804c71b8, a org.apache.zeppelin.spark.PySparkInterpreter),
        which is held by "FIFOScheduler-interpreter_1515166446-Worker-1"
      
      Java stack information for the threads listed above:
      ===================================================
      "Thread-29":
              at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:63)
              - waiting to lock <0x00000000804c7120> (a org.apache.zeppelin.interpreter.LazyOpenInterpreter)
              at org.apache.zeppelin.interpreter.LazyOpenInterpreter.cancel(LazyOpenInterpreter.java:118)
              at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.lambda$cancel$2(RemoteInterpreterServer.java:950)
              at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$$Lambda$2428/1999550584.run(Unknown Source)
              at java.lang.Thread.run(Thread.java:748)
      "FIFOScheduler-interpreter_1515166446-Worker-1":
              at org.apache.zeppelin.interpreter.Interpreter.getInterpreterInTheSameSessionByClassName(Interpreter.java:293)
              - waiting to lock <0x00000000804941e0> (a org.apache.zeppelin.interpreter.InterpreterGroup)
              at org.apache.zeppelin.interpreter.Interpreter.getInterpreterInTheSameSessionByClassName(Interpreter.java:333)
              at org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:57)
              - locked <0x00000000804bc9f8> (a org.apache.zeppelin.spark.IPySparkInterpreter)
              at org.apache.zeppelin.python.PythonInterpreter.open(PythonInterpreter.java:91)
              at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:94)
              at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
              - locked <0x00000000804c71b8> (a org.apache.zeppelin.spark.PySparkInterpreter)
              - locked <0x00000000804c7120> (a org.apache.zeppelin.interpreter.LazyOpenInterpreter)
              at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:861)
              at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:769)
              at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
              at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:132)
              at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:42)
              at org.apache.zeppelin.scheduler.FIFOScheduler$$Lambda$268/1225679228.run(Unknown Source)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      "pool-3-thread-8":
              at org.apache.zeppelin.interpreter.LazyOpenInterpreter.isOpen(LazyOpenInterpreter.java:100)
              - waiting to lock <0x00000000804c71b8> (a org.apache.zeppelin.spark.PySparkInterpreter)
              at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.close(RemoteInterpreterServer.java:496)
              - locked <0x00000000804941e0> (a org.apache.zeppelin.interpreter.InterpreterGroup)
              at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$close.getResult(RemoteInterpreterService.java:1757)
              at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$close.getResult(RemoteInterpreterService.java:1736)
              at org.apache.zeppelin.shaded.org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
              at org.apache.zeppelin.shaded.org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
              at org.apache.zeppelin.shaded.org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      
      Found 1 deadlock.
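
Reading the dump as a wait-for graph makes the cycle easier to see. The sketch below is not Zeppelin code: the class and method names (`WaitForGraphSketch`, `edgesFromDump`, `findCycle`) are hypothetical, and the edges are transcribed by hand from the three "waiting to lock ... which is held by ..." entries above. It assumes each blocked thread waits on exactly one monitor, which holds for this dump.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for reading the dump; not part of Zeppelin.
class WaitForGraphSketch {

    // Thread names exactly as they appear in the dump.
    static final String CANCEL = "Thread-29";
    static final String WORKER = "FIFOScheduler-interpreter_1515166446-Worker-1";
    static final String CLOSE = "pool-3-thread-8";

    // Maps each blocked thread to the thread holding the monitor it waits for.
    static Map<String, String> edgesFromDump() {
        Map<String, String> blockedBy = new HashMap<>();
        blockedBy.put(CANCEL, WORKER); // wants the LazyOpenInterpreter monitor
        blockedBy.put(WORKER, CLOSE);  // wants the InterpreterGroup monitor
        blockedBy.put(CLOSE, WORKER);  // wants the PySparkInterpreter monitor
        return blockedBy;
    }

    // Walks waiter -> holder edges from 'start'. Returns the threads on the
    // cycle the walk runs into, or an empty list if the chain ends.
    static List<String> findCycle(String start, Map<String, String> blockedBy) {
        List<String> path = new ArrayList<>();
        String current = start;
        while (current != null) {
            int seenAt = path.indexOf(current);
            if (seenAt >= 0) {
                return new ArrayList<>(path.subList(seenAt, path.size()));
            }
            path.add(current);
            current = blockedBy.get(current);
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // Prints [FIFOScheduler-interpreter_1515166446-Worker-1, pool-3-thread-8]
        System.out.println(findCycle(CANCEL, edgesFromDump()));
    }
}
```

Starting from `Thread-29`, the walk ends in a two-thread cycle. The worker holds the `PySparkInterpreter` monitor (taken in `LazyOpenInterpreter.open`) and wants the `InterpreterGroup` monitor, while `pool-3-thread-8` holds the `InterpreterGroup` monitor (taken in `RemoteInterpreterServer.close`) and wants the `PySparkInterpreter` monitor. `Thread-29` is not on the cycle itself; its `cancel` call is blocked behind the worker.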
      

            People

              Assignee: Reamer Philipp Dallig
              Reporter: Reamer Philipp Dallig