Apache Tez / TEZ-4251

Acquiring locks for getInputVertices and getOutputVertices is not consistent

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.1
    • Component/s: None
    • Labels: None

    Description

      VertexImpl.getInputVertices() acquires the read lock, whereas VertexImpl.getOutputVertices() does not.
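To make the inconsistency concrete, here is a minimal sketch of the two accessor shapes. It is not the VertexImpl source: the class name VertexLockSketch, the String-keyed maps and the field names are assumptions made for illustration, and only the locking difference between the two getters is taken from the description above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for VertexImpl; names and map types are illustrative only.
class VertexLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> inputs = new HashMap<>();
    private final Map<String, String> outputs = new HashMap<>();

    // Guarded accessor: blocks while another thread holds the write lock.
    Map<String, String> getInputVertices() {
        lock.readLock().lock();
        try {
            return Collections.unmodifiableMap(inputs);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Unguarded accessor: never touches the lock, so it never waits on a writer.
    Map<String, String> getOutputVertices() {
        return Collections.unmodifiableMap(outputs);
    }

    public static void main(String[] args) {
        VertexLockSketch vertex = new VertexLockSketch();
        System.out.println(vertex.getInputVertices().size());
        System.out.println(vertex.getOutputVertices().size());
    }
}
```

Either both accessors should take the read lock or neither should. This sketch does not say which direction the attached patch takes.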

      We also ran into a deadlock when using Tez from Hive; see container_jstack.txt.

      0. LlapTaskSchedulerService and VertexImpl each define their own ReentrantReadWriteLock instance.
      1. Thread "LlapScheduler" acquired the write lock on LlapTaskSchedulerService.lock

      LlapTaskSchedulerService.java
        protected void schedulePendingTasks() throws InterruptedException {
          Ref<TaskInfo> downgradedTask = new Ref<>(null);
          writeLock.lock();
      

      2. Thread "Dispatcher thread {Central}" acquired the write lock on VertexImpl.lock

      VertexImpl.java
        public void handle(VertexEvent event) {
      ...
          try {
            writeLock.lock();
      

      3. Thread "LlapScheduler" tries to acquire the read lock on VertexImpl.lock

      VertexImpl.java
        @Override
        public Map<Vertex, Edge> getInputVertices() {
          readLock.lock();
      

      but it has to wait because thread "Dispatcher thread {Central}" holds the write lock on VertexImpl.lock

      4. Thread "Dispatcher thread {Central}" tries to acquire the read lock on LlapTaskSchedulerService.lock

      LlapTaskSchedulerService.java
        @Override
        public Resource getTotalResources() {
      ...
          readLock.lock();
      

      but it has to wait because thread "LlapScheduler" holds the write lock on LlapTaskSchedulerService.lock, so neither thread can make progress.
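The four steps above can be reproduced outside Tez and Hive with two plain ReentrantReadWriteLock instances. The sketch below is not taken from either code base: the class LockCycleSketch, the helper holdThenTry and the 200 ms timeout are assumptions made for illustration. It uses a timed tryLock for the second acquisition so that the demo reports the cycle and exits instead of hanging the way the real threads did.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo of the lock cycle; not code from Tez or Hive.
class LockCycleSketch {

    // Returns true when neither thread could take its second lock, i.e. the cycle formed.
    static boolean reproduceCycle() {
        // Stand-ins for LlapTaskSchedulerService.lock and VertexImpl.lock.
        ReentrantReadWriteLock schedulerLock = new ReentrantReadWriteLock();
        ReentrantReadWriteLock vertexLock = new ReentrantReadWriteLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] gotSecond = new boolean[2];

        // Steps 1 and 3: write lock on the scheduler, then read lock on the vertex.
        Thread scheduler = new Thread(() -> holdThenTry(schedulerLock.writeLock(),
                vertexLock.readLock(), 0, gotSecond, bothHoldFirst, bothTried), "LlapScheduler");
        // Steps 2 and 4: write lock on the vertex, then read lock on the scheduler.
        Thread dispatcher = new Thread(() -> holdThenTry(vertexLock.writeLock(),
                schedulerLock.readLock(), 1, gotSecond, bothHoldFirst, bothTried),
                "Dispatcher thread {Central}");
        scheduler.start();
        dispatcher.start();
        try {
            scheduler.join();
            dispatcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !gotSecond[0] && !gotSecond[1];
    }

    private static void holdThenTry(Lock first, Lock second, int slot, boolean[] gotSecond,
                                    CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
        first.lock();
        try {
            // Wait until both threads hold their first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // Timed attempt on the lock the other thread holds in write mode.
            gotSecond[slot] = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (gotSecond[slot]) {
                second.unlock();
            }
            // Keep holding the first lock until both attempts have finished.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("cycle formed: " + reproduceCycle());
    }
}
```

Each thread keeps its write lock until both timed attempts have returned, so both attempts are expected to time out. With an untimed readLock.lock(), as in the excerpts above, the two threads would wait on each other indefinitely.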

      Attachments

        1. TEZ-4251.1.patch
          0.7 kB
          Krisztian Kasa
        2. container_jstack.txt
          1.28 MB
          Krisztian Kasa

          People

            Assignee: kkasa Krisztian Kasa
            Reporter: kkasa Krisztian Kasa