Hadoop HDFS / HDFS-10967

Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: namenode
    • Labels:

      Description

      Large production clusters are likely to have nodes that are heterogeneous in storage capacity, memory, and CPU cores. It is not always possible to ingest data into DataNodes in proportion to their remaining storage capacity, so a subset of DataNodes can end up much closer to full capacity than the rest.

      This heterogeneity is most likely to occur rack by rack, i.e. m whole racks of low-storage nodes and n whole racks of high-storage nodes. It would therefore be very useful to lower the chance that those near-full DataNodes become destinations for the 2nd and 3rd replicas.
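
      As a rough illustration of the idea in the description, the sketch below picks a replica target at random but usually rejects candidates whose used fraction is above a threshold. It is not taken from the attached patches and does not use the real BlockPlacementPolicy API. The class NearFullAwareChooser, the Node type, and the two settings (nearFullThreshold, acceptNearFullProbability) are all hypothetical names made up for this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Hypothetical sketch only: not the HDFS BlockPlacementPolicy API.
class NearFullAwareChooser {

    /** Hypothetical stand-in for a DataNode's storage report. */
    static final class Node {
        final String name;
        final long capacityBytes;
        final long remainingBytes;

        Node(String name, long capacityBytes, long remainingBytes) {
            this.name = name;
            this.capacityBytes = capacityBytes;
            this.remainingBytes = remainingBytes;
        }

        /** Fraction of capacity in use; a zero-capacity node counts as full. */
        double usedFraction() {
            if (capacityBytes == 0) {
                return 1.0;
            }
            return 1.0 - (double) remainingBytes / capacityBytes;
        }
    }

    private static final int MAX_ATTEMPTS = 10;

    // Assumed settings; the issue does not name the actual configuration keys.
    private final double nearFullThreshold;         // e.g. 0.9 means "90% used"
    private final double acceptNearFullProbability; // chance to keep a near-full pick
    private final Random random;

    NearFullAwareChooser(double nearFullThreshold,
                         double acceptNearFullProbability,
                         Random random) {
        this.nearFullThreshold = nearFullThreshold;
        this.acceptNearFullProbability = acceptNearFullProbability;
        this.random = random;
    }

    /** Picks a target, lowering (not eliminating) the odds of near-full nodes. */
    Node choose(List<Node> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidate nodes");
        }
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            Node picked = candidates.get(random.nextInt(candidates.size()));
            boolean nearFull = picked.usedFraction() >= nearFullThreshold;
            if (!nearFull || random.nextDouble() < acceptNearFullProbability) {
                return picked;
            }
        }
        // Every attempt was rejected: fall back to the least-used candidate.
        return candidates.stream()
                .min(Comparator.comparingDouble(Node::usedFraction))
                .get();
    }

    public static void main(String[] args) {
        NearFullAwareChooser chooser =
                new NearFullAwareChooser(0.9, 0.1, new Random());
        List<Node> nodes = Arrays.asList(
                new Node("dn-low-storage", 100L, 5L),
                new Node("dn-high-storage", 1000L, 600L));
        System.out.println(chooser.choose(nodes).name);
    }
}
```

      Keeping a small acceptance probability, rather than excluding near-full nodes outright, means placement can still succeed when most candidates in a rack are near full. Whether the patch takes this approach is not stated in the description.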

        Attachments

        1. HDFS-10967.03.patch
          26 kB
          Zhe Zhang
        2. HDFS-10967.02.patch
          27 kB
          Zhe Zhang
        3. HDFS-10967.01.patch
          11 kB
          Zhe Zhang
        4. HDFS-10967.00.patch
          4 kB
          Zhe Zhang

              People

              • Assignee:
                zhz Zhe Zhang
                Reporter:
                zhz Zhe Zhang
              • Votes:
                1
                Watchers:
                17
