Details
Bug
Status: Open
Minor
Resolution: Unresolved
2.5.0, 2.6.0
None
Description
The Rack Awareness documentation references a rack-topology.sh script which has two small flaws:
1) From 2.x.x the default config dir is ..etc/hadoop, not ..etc/hadoop/conf.
2) When mapping DataNodes to rack IDs in the rack_topology.data file, hostnames do not work: the rack-topology.sh script itself returns the prefixed rack ID, but the balancer and the fsck report omit the rack ID and show only a single rack. IP addresses in the data file work fine.
(e.g. when using hostnames:
rack_topology.data
------------------------
datanode0 01
..
grep NetworkTopology logs/hadoop-hduser-namenode-NameNode0.log
--------------------------------------------------------------------------------------------
2014-08-27 10:29:52,518 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /LAB/rack/192.168.0.12:50010
hdfs fsck /
-------------
Number of data-nodes: 3
Number of racks: 1)
(e.g. when using IP addresses:
rack_topology.data
-----------------------
192.168.0.10 01
..
grep NetworkTopology logs/hadoop-hduser-namenode-NameNode0.log
-----------------------
2014-08-27 11:14:22,796 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /LAB/rack_01/192.168.0.10:50010
hdfs fsck /
-------------
Number of data-nodes: 3
Number of racks: 2)
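For illustration, here is a minimal sketch of a lookup script that takes both points into account. It is not the script from the documentation. The function name lookup_rack, the DATA_FILE and RACK_PREFIX variables, and the fallback to the bare prefix are assumptions made for this example. The /LAB/rack prefix is taken from the log lines above, and the conf dir defaults to a relative etc/hadoop because the full path is not given here. The NameNode log above shows nodes being added by IP address, so, assuming the script is called with IP addresses, a data file keyed by hostname never matches and every node falls back to the same rack.

```shell
#!/bin/sh
# Sketch only: names and defaults are illustrative assumptions,
# not the script shipped with the Hadoop documentation.

# Assumed data file location; 2.x uses a conf dir ending in etc/hadoop
# rather than etc/hadoop/conf. HADOOP_CONF_DIR overrides the default.
DATA_FILE="${HADOOP_CONF_DIR:-etc/hadoop}/rack_topology.data"

# Rack prefix as seen in the NameNode log lines of this report.
RACK_PREFIX="/LAB/rack"

# lookup_rack FILE NODE
# Prints "<prefix>_<id>" when NODE is the first field of a line in FILE,
# otherwise prints the bare prefix (the single-rack case).
lookup_rack() {
  local file="$1" node="$2" key rack
  if [ -r "$file" ]; then
    # The "|| [ -n ... ]" part keeps a last line without a trailing newline.
    while read -r key rack _ || [ -n "$key" ]; do
      if [ "$key" = "$node" ]; then
        printf '%s_%s' "$RACK_PREFIX" "$rack"
        return 0
      fi
    done < "$file"
  fi
  printf '%s' "$RACK_PREFIX"
}

# One rack path per argument, separated by spaces.
for node in "$@"; do
  lookup_rack "$DATA_FILE" "$node"
  printf ' '
done
```

With a data file containing only "datanode0 01", calling the script with datanode0 gives /LAB/rack_01, while calling it with 192.168.0.12 gives /LAB/rack. That matches the single rack seen in the hostname case above. Listing the IP addresses in the data file avoids the fallback.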