Hadoop HDFS / HDFS-17238

Setting the value of "dfs.blocksize" too large will cause HDFS to be unable to write to files


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 3.3.6
    • Fix Version/s: None
    • Component/s: hdfs
    • Labels: None

    Description

      My Hadoop version is 3.3.6, running in pseudo-distributed mode.

      My core-site.xml is as follows:

      <configuration>
          <property>
              <name>fs.defaultFS</name>
              <value>hdfs://localhost:9000</value>
          </property>
          <property>
              <name>hadoop.tmp.dir</name>
              <value>/home/hadoop/Mutil_Component/tmp</value>
          </property>
      </configuration>

      My hdfs-site.xml is as follows:

      <configuration>
          <property>
              <name>dfs.replication</name>
              <value>1</value>
          </property>
          <property>
              <name>dfs.blocksize</name>
              <value>1342177280000</value>
          </property>
      </configuration>
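For scale (my own arithmetic, not part of the original report), that dfs.blocksize value is 10000 times the 128 MiB default, i.e. roughly 1250 GiB per block:

```shell
# Compare the configured dfs.blocksize with the HDFS default (128 MiB).
configured=1342177280000             # value from hdfs-site.xml, in bytes
default=$(( 128 * 1024 * 1024 ))     # 134217728 bytes, the HDFS default

echo "ratio to default: $(( configured / default ))x"                # 10000x
echo "block size in GiB: $(( configured / (1024*1024*1024) ))"       # 1250
```

No single disk on a typical pseudo-distributed test machine has 1250 GiB free, so the lone DataNode can never hold even one full block.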

      Then I format the NameNode and start HDFS; it starts up normally.

      hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ bin/hdfs namenode -format
      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx(many info)
      hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ sbin/start-dfs.sh
      Starting namenodes on [localhost]
      Starting datanodes
      Starting secondary namenodes [hadoop-Standard-PC-i440FX-PIIX-1996] 

      Finally, I use dfs to upload some files:

      bin/hdfs dfs -mkdir -p /user/hadoop
      bin/hdfs dfs -mkdir input
      bin/hdfs dfs -put etc/hadoop/*.xml input 

      The put command fails with the following exception:

      2023-10-19 14:56:34,603 WARN hdfs.DataStreamer: DataStreamer Exception
      org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hadoop/input/capacity-scheduler.xml._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
              at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2350)
              at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2989)
              at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:912)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:595)
              at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
              at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621)
              at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
              at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
              at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1094)
              at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1017)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3048)
              at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1567)
              at org.apache.hadoop.ipc.Client.call(Client.java:1513)
              at org.apache.hadoop.ipc.Client.call(Client.java:1410)
              at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
              at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
              at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:531)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
              at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
              at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
              at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
              at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
              at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1088)
              at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1915)
              at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1717)
              at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:713)
      put: File /user/hadoop/input/capacity-scheduler.xml._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation. 


      Analysis

      The error message means that the NameNode could not find any DataNode able to accept the new block. When "dfs.blocksize" is set far larger than the free space on any DataNode (here 1342177280000 bytes, about 1250 GiB), the block placement policy appears to exclude every DataNode, since none of them can guarantee room for even a single full block. As a result, writes fail even though the cluster itself is running normally.
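As a workaround (my suggestion, not part of the original report), restoring dfs.blocksize to a value the DataNode disk can actually hold, e.g. the stock default, makes the write succeed:

```xml
<!-- hdfs-site.xml: 134217728 bytes = 128 MiB, the HDFS default block size.
     Any value comfortably below the DataNode's free disk space would do. -->
<property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
</property>
```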


          People

            Assignee: Unassigned
            Reporter: ECFuzz
            Votes: 0
            Watchers: 4
