Description
Hi...
I'm trying to load an external TEXTFILE table into an internal ORC table using Hive. My process fails with the following error:
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hive/blablabla.... could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and no node(s) are excluded in this operation.
After investigating, I saw that the amount of "non DFS used" space grows steadily until the job fails.
Just before the failure, "non DFS used" reaches 54 GB on each datanode, while I still have space remaining in DFS.
Here is the dfsadmin report taken just before the issue:
[hdfs@hadoop-01 data]$ hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Configured Capacity: 475193597952 (442.56 GB)
Present Capacity: 290358095182 (270.42 GB)
DFS Remaining: 228619903369 (212.92 GB)
DFS Used: 61738191813 (57.50 GB)
DFS Used%: 21.26%
Under replicated blocks: 38
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 192.168.3.36:50010 (hadoop-04.XXXXX.local)
Hostname: hadoop-04.XXXXX.local
Decommission Status : Normal
Configured Capacity: 158397865984 (147.52 GB)
DFS Used: 20591481196 (19.18 GB)
Non DFS Used: 61522602976 (57.30 GB)
DFS Remaining: 76283781812 (71.04 GB)
DFS Used%: 13.00%
DFS Remaining%: 48.16%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 182
Last contact: Tue Mar 24 10:56:05 CET 2015
Name: 192.168.3.35:50010 (hadoop-03.XXXXX.local)
Hostname: hadoop-03.XXXXX.local
Decommission Status : Normal
Configured Capacity: 158397865984 (147.52 GB)
DFS Used: 20555853589 (19.14 GB)
Non DFS Used: 61790296136 (57.55 GB)
DFS Remaining: 76051716259 (70.83 GB)
DFS Used%: 12.98%
DFS Remaining%: 48.01%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 184
Last contact: Tue Mar 24 10:56:05 CET 2015
Name: 192.168.3.37:50010 (hadoop-05.XXXXX.local)
Hostname: hadoop-05.XXXXX.local
Decommission Status : Normal
Configured Capacity: 158397865984 (147.52 GB)
DFS Used: 20590857028 (19.18 GB)
Non DFS Used: 61522603658 (57.30 GB)
DFS Remaining: 76284405298 (71.05 GB)
DFS Used%: 13.00%
DFS Remaining%: 48.16%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 182
Last contact: Tue Mar 24 10:56:05 CET 2015
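As far as I understand, "Non DFS Used" is not measured directly; it is derived as configured capacity minus DFS used minus DFS remaining. As a sanity check, plugging in the hadoop-04 numbers from the report above reproduces the reported figure exactly:

```shell
# "Non DFS Used" as HDFS derives it: capacity - dfsUsed - remaining
# (values taken from the hadoop-04 entry in the dfsadmin report above)
capacity=158397865984
dfs_used=20591481196
dfs_remaining=76283781812
non_dfs=$((capacity - dfs_used - dfs_remaining))
echo "$non_dfs"    # prints 61522602976, i.e. the reported 57.30 GB
```

So the 57 GB figure is just whatever capacity HDFS cannot account for, not something I can necessarily find as real files on disk.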
I expected to find temporary space being used within my filesystem (i.e. /data).
I found the DFS usage under /data/hadoop/hdfs/data (19 GB), but no trace of the 57 GB of non-DFS usage:
[root@hadoop-05 hadoop]# df -h /data
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 148G 20G 121G 14% /data
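To make the mismatch concrete: per datanode, HDFS claims roughly 19 GB DFS plus 57 GB non-DFS in use, yet df only sees about 20 GB used on the mount. A quick back-of-the-envelope check, using the hadoop-05 entry from the report and the df output above:

```shell
# HDFS's view of hadoop-05 (bytes, from the dfsadmin report)
dfs_used=20590857028
non_dfs_used=61522603658
hdfs_total=$((dfs_used + non_dfs_used))
# What the OS actually reports as used on /data (20G from df -h, approximated in bytes)
os_used=$((20 * 1024 * 1024 * 1024))
echo "HDFS thinks $hdfs_total bytes are in use; the OS sees only about $os_used"
```

So HDFS believes about 82 GB of the 147 GB disk is consumed, while the operating system sees roughly a quarter of that actually on disk.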
I also checked dfs.datanode.du.reserved, which is set to zero:
[root@hadoop-05 hadoop]# hdfs getconf -confkey dfs.datanode.du.reserved
0
Did I miss something? Where does this non-DFS space live on Linux? And why did I get the message "could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and no node(s) are excluded in this operation." when all three datanodes were up and running with DFS space remaining?
This error is blocking us.