Apache NiFi / NIFI-10052

Avoid obtaining any locks when creating/sending heartbeats

Details

    Description

      When NiFi creates a heartbeat to send to the coordinator, it currently must obtain several locks in order to generate that heartbeat. We should avoid obtaining any read locks, write locks, or synchronized monitors, especially those that may be held for a while. Otherwise, if a write lock is held for a long time, heartbeat creation is blocked and NiFi can get disconnected from the cluster.
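
The hazard can be illustrated with a minimal sketch, assuming a plain `ReentrantReadWriteLock` stands in for the lock that heartbeat creation depends on. `BlockedHeartbeatDemo` and `heartbeatCanRead` are hypothetical names for illustration only, not NiFi code: a separate "heartbeat" thread tries to take the read lock with a timeout and reports whether it succeeded.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; it does not reproduce any NiFi implementation.
final class BlockedHeartbeatDemo {

    // Returns true if a separate "heartbeat" thread could take the read lock
    // within waitMillis, false if it timed out (e.g. a writer held the lock).
    static boolean heartbeatCanRead(ReentrantReadWriteLock lock, long waitMillis) {
        AtomicBoolean acquired = new AtomicBoolean(false);
        Thread heartbeat = new Thread(() -> {
            try {
                if (lock.readLock().tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
                    acquired.set(true);
                    lock.readLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        heartbeat.start();
        try {
            heartbeat.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired.get();
    }
}
```

While another thread holds the write lock, `heartbeatCanRead` times out and returns false. In a cluster, a delay of that kind on every heartbeat attempt is what would eventually lead to a disconnect.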

      Specifically, the following locks are obtained, at minimum:

      • FlowController readLock in the createHeartbeatMessage() method. Due to refactoring, this read lock is not necessary at all.
      • revisionManager.getRevisionUpdateCount() is synchronized. However, the synchronization here is not needed, as the method just returns the result of an AtomicLong.get(). This is perhaps the most important lock to avoid, because any update to a component or group of components happens within revisionManager.updateRevision, which is also synchronized. So a large request, like deleting thousands of components, will block heartbeats from being created until the request completes.
      • FlowController.getTotalFlowFileCount - this may be the most challenging to eliminate. It calls ProcessGroup.getConnections() and ProcessGroup.getProcessGroups(), which means that it must obtain the Process Group's read lock twice for every Process Group in the flow. We may be able to change StandardProcessGroup's connections and processGroups maps to ConcurrentHashMaps and introduce a getQueueSize() method on ProcessGroup that avoids most of this locking.
      • The createHeartbeatMessage() method also appears to reference FlowController's connectionStatus member variable without holding any lock, although the variable is not volatile and its documentation indicates that it is guarded by the read/write lock. This needs to be addressed to ensure that the connectionStatus is always accurately reported.
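
The ideas in the list above can be sketched together as follows. This is only an illustration under stated assumptions, not the actual fix: `HeartbeatSketch`, `RevisionCounter`, `Group`, and `ConnectionState` are hypothetical stand-ins for FlowController, the revision manager, ProcessGroup, and the connection status, and the heartbeat is reduced to a string. The sketch reads the update count without a monitor, sums queue sizes over concurrent maps without a read lock, and marks the status field volatile.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-ins only; none of these types are NiFi's real classes.
final class HeartbeatSketch {

    static final class RevisionCounter {
        private final AtomicLong updateCount = new AtomicLong();

        // Updates may still be serialized by a monitor, as described in the issue.
        synchronized void updateRevision() {
            updateCount.incrementAndGet();
        }

        // Not synchronized: AtomicLong.get() is already a thread-safe read.
        long getRevisionUpdateCount() {
            return updateCount.get();
        }
    }

    static final class Group {
        private final Map<String, AtomicLong> connectionQueues = new ConcurrentHashMap<>();
        private final Map<String, Group> childGroups = new ConcurrentHashMap<>();

        void addConnection(String id, long queued) {
            connectionQueues.put(id, new AtomicLong(queued));
        }

        void addGroup(String id, Group child) {
            childGroups.put(id, child);
        }

        // No read lock: iteration over ConcurrentHashMap views is weakly
        // consistent, so the total is approximate while the flow is changing.
        long getQueueSize() {
            long total = 0;
            for (AtomicLong queued : connectionQueues.values()) {
                total += queued.get();
            }
            for (Group child : childGroups.values()) {
                total += child.getQueueSize();
            }
            return total;
        }
    }

    enum ConnectionState { CONNECTING, CONNECTED, DISCONNECTED }

    final RevisionCounter revisions = new RevisionCounter();
    final Group root = new Group();

    // volatile so the heartbeat thread sees the latest value without a lock.
    private volatile ConnectionState connectionStatus = ConnectionState.CONNECTING;

    void setConnectionStatus(ConnectionState status) {
        connectionStatus = status;
    }

    // Acquires no read lock, write lock, or monitor.
    String createHeartbeatMessage() {
        return connectionStatus + ":" + revisions.getRevisionUpdateCount() + ":" + root.getQueueSize();
    }
}
```

The trade-off in this sketch is that the queue total is a point-in-time approximation rather than a consistent snapshot, which may be acceptable for a heartbeat that is re-sent periodically.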

            People

              s9514171 Hsin-Ying Lee
              markap14 Mark Payne
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 0.5h