Hadoop HDFS / HDFS-15901

Solve the problem of DN repeated block reports occupying too many RPCs during Safemode

Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved

    Description

      When a cluster grows to several thousand nodes and the NameNode service is restarted, every DataNode sends a full block report to the NameNode. During SafeMode, some DataNodes send their block reports to the NameNode more than once, and these repeated reports consume a large share of the NameNode's RPC capacity even though they are unnecessary.
      As a result, some block report leases fail validation or time out, and in extreme cases the NameNode never leaves SafeMode. The NameNode log shows the repeated reports being discarded and the leases being rejected:

      2021-03-14 08:16:25,873 [78438700] - INFO [Block report processor:BlockManager@2158] - BLOCK* processReport 0xexxxxxxxx: discarded non-initial block report from DatanodeRegistration(xxxxxxxx:port, datanodeUuid=xxxxxxxx, infoPort=xxxxxxxx, infoSecurePort=xxxxxxxx, ipcPort=xxxxxxxx, storageInfo=lv=xxxxxxxx;nsid=xxxxxxxx;c=0) because namenode still in startup phase
      2021-03-14 08:16:31,521 [78444348] - INFO [Block report processor:BlockManager@2158] - BLOCK* processReport 0xexxxxxxxx: discarded non-initial block report from DatanodeRegistration(xxxxxxxx, datanodeUuid=xxxxxxxx, infoPort=xxxxxxxx, infoSecurePort=xxxxxxxx, ipcPort=xxxxxxxx, storageInfo=lv=xxxxxxxx;nsid=xxxxxxxx;c=0) because namenode still in startup phase

      2021-03-13 18:35:38,200 [29191027] - WARN [Block report processor:BlockReportLeaseManager@311] - BR lease 0xxxxxxxxx is not valid for DN xxxxxxxx, because the DN is not in the pending set.
      2021-03-13 18:36:08,143 [29220970] - WARN [Block report processor:BlockReportLeaseManager@311] - BR lease 0xxxxxxxxx is not valid for DN xxxxxxxx, because the DN is not in the pending set.
      2021-03-13 18:36:08,143 [29220970] - WARN [Block report processor:BlockReportLeaseManager@317] - BR lease 0xxxxxxxxx is not valid for DN xxxxxxxx, because the lease has expired.
      2021-03-13 18:36:08,145 [29220972] - WARN [Block report processor:BlockReportLeaseManager@317] - BR lease 0xxxxxxxxx is not valid for DN xxxxxxxx, because the lease has expired.
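
To gauge how much RPC traffic is being wasted, it can help to count how often each of the three messages above occurs in a NameNode log. The following is a minimal sketch, not part of Hadoop: the class `BlockReportLogTally` and its `Kind` enum are hypothetical names, and the match strings are taken only from the log excerpt above, so they may need adjusting for other Hadoop versions.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not a Hadoop class): tallies block report rejections in NameNode log lines. */
final class BlockReportLogTally {

    /** Categories keyed on message fragments copied from the log excerpt in this issue. */
    enum Kind {
        DISCARDED_NON_INITIAL("discarded non-initial block report"),
        LEASE_NOT_PENDING("because the DN is not in the pending set"),
        LEASE_EXPIRED("because the lease has expired");

        final String marker;

        Kind(String marker) {
            this.marker = marker;
        }
    }

    /** Returns the category of a log line, or null if the line matches none of the markers. */
    static Kind classify(String line) {
        for (Kind kind : Kind.values()) {
            if (line.contains(kind.marker)) {
                return kind;
            }
        }
        return null;
    }

    /** Counts the lines in each category; categories with no matches map to 0. */
    static Map<Kind, Integer> tally(List<String> lines) {
        Map<Kind, Integer> counts = new EnumMap<>(Kind.class);
        for (Kind kind : Kind.values()) {
            counts.put(kind, 0);
        }
        for (String line : lines) {
            Kind kind = classify(line);
            if (kind != null) {
                counts.merge(kind, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

The lines of a log file could be read with `java.nio.file.Files.readAllLines` and passed to `tally`. A high `DISCARDED_NON_INITIAL` count relative to the number of DataNodes would be consistent with the repeated reports described above.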

            People

              Assignee: JiangHua Zhu (jianghuazhu)
              Reporter: JiangHua Zhu (jianghuazhu)
              Votes: 0
              Watchers: 7

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h