HBASE-28184: Tailing the WAL is very slow if there are multiple peers.

    Description

Noticed this in one of our production clusters, which has 4 peers.

Due to a sudden ingestion of data, the size of the log queue increased to a peak of 506. We have configured the log roll size to 256 MB. Most of the edits in these WALs were from a table for which replication is disabled.

So all the ReplicationSourceWALReader threads had to do was read through the WALs and NOT replicate the edits. Still, it took 12 hours to drain the queue.

Took a few jstacks and found that ReplicationSourceWALReader was waiting to acquire rollWriterLock here:

      "regionserver/<rs>,1" #1036 daemon prio=5 os_prio=0 tid=0x00007f44b374e800 nid=0xbd7f waiting on condition [0x00007f37b4d19000]
         java.lang.Thread.State: WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x00007f3897a3e150> (a java.util.concurrent.locks.ReentrantLock$FairSync)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:837)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:872)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1202)
              at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:228)
              at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
              at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.getLogFileSizeIfBeingWritten(AbstractFSWAL.java:1102)
              at org.apache.hadoop.hbase.wal.WALProvider.lambda$null$0(WALProvider.java:128)
              at org.apache.hadoop.hbase.wal.WALProvider$$Lambda$177/1119730685.apply(Unknown Source)
              at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
              at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1361)
              at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
              at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
              at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
              at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
              at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
              at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
              at java.util.stream.ReferencePipeline.findAny(ReferencePipeline.java:536)
              at org.apache.hadoop.hbase.wal.WALProvider.lambda$getWALFileLengthProvider$2(WALProvider.java:129)
              at org.apache.hadoop.hbase.wal.WALProvider$$Lambda$140/1246380717.getLogFileSizeIfBeingWritten(Unknown Source)
              at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:260)
              at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:172)
              at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
              at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:222)
              at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:157)
      

 All the peers contend for this lock during every batch read.
Look at the code snippet below. We guard this section with rollWriterLock, which is only needed when we are replicating the active WAL file. In our case we are NOT replicating the active WAL file, yet we still acquire the lock only to return OptionalLong.empty().

  /**
   * if the given {@code path} is being written currently, then return its length.
   * <p>
   * This is used by replication to prevent replicating unacked log entries. See
   * https://issues.apache.org/jira/browse/HBASE-14004 for more details.
   */
  @Override
  public OptionalLong getLogFileSizeIfBeingWritten(Path path) {
    rollWriterLock.lock();
    try {
      // ... body elided: returns the length only when path is the WAL
      // currently being written, otherwise OptionalLong.empty() ...
      ...
    } finally {
      rollWriterLock.unlock();
    }
  }
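
For context on why every batch read hits this lock: per the WALProvider frames in the stack above, the length provider handed to WALEntryStream probes each WAL managed by the provider in turn, and each probe takes that WAL's rollWriterLock. The standalone sketch below only illustrates that shape; the WalLengthProviderChain class, the Wal interface and the lengthProvider helper are invented for this example and are not the HBase source.

  import java.util.List;
  import java.util.OptionalLong;
  import java.util.function.Function;

  public class WalLengthProviderChain {

    /** Stand-in for a single WAL; the real method takes that WAL's rollWriterLock internally. */
    interface Wal {
      OptionalLong getLogFileSizeIfBeingWritten(String path);
    }

    /**
     * Same shape as the provider in the stack trace: for every length lookup, probe the WALs
     * one by one (each probe locks that WAL) until one reports the path as being written.
     * Every ReplicationSourceWALReader (one per peer) does this for every batch it reads.
     */
    static Function<String, OptionalLong> lengthProvider(List<Wal> wals) {
      return path -> wals.stream()
          .map(wal -> wal.getLogFileSizeIfBeingWritten(path))
          .filter(OptionalLong::isPresent)
          .findAny()
          .orElse(OptionalLong.empty());
    }
  }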

We can check the size of the log queue, and if it is greater than 1 (meaning the file being read has already been rolled and cannot be the active WAL) we can return early without acquiring the lock.
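
A minimal standalone sketch of that idea, assuming the reader knows its log queue size; the EarlyReturnSketch class, the method names and the logQueueSize/activeWalPath parameters are invented for illustration and are not the actual patch.

  import java.util.OptionalLong;
  import java.util.concurrent.locks.ReentrantLock;

  public class EarlyReturnSketch {

    private final ReentrantLock rollWriterLock = new ReentrantLock(true); // fair, as in the jstack
    private volatile long activeWalLength;

    /** Current behaviour: always take rollWriterLock, even for already-rolled files. */
    OptionalLong lengthIfBeingWrittenLocked(String path, String activeWalPath) {
      rollWriterLock.lock();
      try {
        return path.equals(activeWalPath) ? OptionalLong.of(activeWalLength) : OptionalLong.empty();
      } finally {
        rollWriterLock.unlock();
      }
    }

    /** Proposed behaviour: a cheap queue-size pre-check lets readers of rolled WALs skip the lock. */
    OptionalLong lengthIfBeingWritten(String path, String activeWalPath, int logQueueSize) {
      if (logQueueSize > 1) {
        // More than one WAL queued => the file being read has already been rolled,
        // so it cannot be the file currently being written. No lock needed.
        return OptionalLong.empty();
      }
      return lengthIfBeingWrittenLocked(path, activeWalPath);
    }
  }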

People

  Assignee: Rushabh Shah (shahrs87)
  Reporter: Rushabh Shah (shahrs87)