HBase / HBASE-20046 Reconsider the implementation for serial replication / HBASE-20271

ReplicationSourceWALReader.switched should use the file name instead of the path object directly

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-alpha-1, 2.1.0
    • Component/s: Replication
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      2018-03-24 08:29:29,965 ERROR [RS_REFRESH_PEER-regionserver/ubuntu:0-0.replicationSource,2.replicationSource.shipperubuntu%2C35197%2C1521851267085,2] helpers.MarkerIgnoringBase(159): ***** ABORTING region server ubuntu,35197,1521851267085: Failed to operate on replication queue *****
      org.apache.hadoop.hbase.replication.ReplicationException: Failed to set log position (serverName=ubuntu,35197,1521851267085, queueId=2, fileName=ubuntu%2C35197%2C1521851267085.1521851344947, position=2533)
      	at org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.setWALPosition(ZKReplicationQueueStorage.java:237)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.lambda$9(ReplicationSourceManager.java:488)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.abortWhenFail(ReplicationSourceManager.java:455)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:488)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:232)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:134)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:104)
      Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
      	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
      	at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1006)
      	at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:910)
      	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.multi(RecoverableZooKeeper.java:663)
      	at org.apache.hadoop.hbase.zookeeper.ZKUtil.multiOrSequential(ZKUtil.java:1670)
      	at org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.setWALPosition(ZKReplicationQueueStorage.java:235)
      	... 6 more
      
      2018-03-24 08:29:30,025 ERROR [RpcServer.default.FPBQ.Fifo.handler=2,queue=0,port=37509] master.MasterRpcServices(508): Region server ubuntu,35197,1521851267085 reported a fatal error:
      ***** ABORTING region server ubuntu,35197,1521851267085: Failed to operate on replication queue *****
      Cause:
      org.apache.hadoop.hbase.replication.ReplicationException: Failed to set log position (serverName=ubuntu,35197,1521851267085, queueId=2, fileName=ubuntu%2C35197%2C1521851267085.1521851344947, position=2533)
      	at org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.setWALPosition(ZKReplicationQueueStorage.java:237)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.lambda$9(ReplicationSourceManager.java:488)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.abortWhenFail(ReplicationSourceManager.java:455)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:488)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:232)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:134)
      	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:104)
      Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
      	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
      	at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:1006)
      	at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:910)
      	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.multi(RecoverableZooKeeper.java:663)
      	at org.apache.hadoop.hbase.zookeeper.ZKUtil.multiOrSequential(ZKUtil.java:1670)
      	at org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.setWALPosition(ZKReplicationQueueStorage.java:235)
      	... 6 more
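
The issue text does not spell out the root cause, so the following is only a sketch of the idea in the title, not the attached patch. It assumes the same WAL file can be reached through two different path objects (for example, a live directory and an archive directory), in which case comparing whole paths reports a file switch that never happened, while comparing file names does not. The class name `WalSwitchCheck`, both method names, and the directory layout are hypothetical, and `java.nio.file.Path` stands in for Hadoop's `Path` to keep the example self-contained.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class WalSwitchCheck {

    // Compares the whole path: reports a switch whenever the directory differs,
    // even if both paths refer to the same WAL file.
    static boolean switchedByPath(Path lastRead, Path current) {
        return current != null && !lastRead.equals(current);
    }

    // Compares only the final path component (the WAL file name).
    static boolean switchedByName(Path lastRead, Path current) {
        return current != null
            && !lastRead.getFileName().equals(current.getFileName());
    }

    public static void main(String[] args) {
        // File name taken from the log above; directories are illustrative assumptions.
        String wal = "ubuntu%2C35197%2C1521851267085.1521851344947";
        Path live = Paths.get("/hbase/WALs/ubuntu,35197,1521851267085", wal);
        Path archived = Paths.get("/hbase/oldWALs", wal);

        System.out.println(switchedByPath(live, archived)); // true: directories differ
        System.out.println(switchedByName(live, archived)); // false: same file name
    }
}
```

With Hadoop's `Path`, the corresponding accessor for the final component is `getName()`.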
      

      Attachments

        1. HBASE-20271.patch (1 kB, Duo Zhang)

          People

            Assignee: Duo Zhang (zhangduo)
            Reporter: Duo Zhang (zhangduo)