Description
Right now, every connection in HDFS is based on RPC/IPC, which in turn runs over TCP. Connections are reused based on ConnectionID, which includes RpcTimeout as part of the key that identifies a connection. As a consequence, using a different RPC timeout between the same two hosts creates a separate TCP connection.
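The effect of keying connections on the timeout can be illustrated with a small model. This is a simplified sketch, not Hadoop's actual code: `ConnectionCacheSketch`, `ConnKey`, and `Conn` are hypothetical names, and the key here holds only host, port, and timeout for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical model of a connection cache whose key includes the RPC timeout.
class ConnectionCacheSketch {
    // Simplified stand-in for a connection key (assumed fields, illustration only).
    static final class ConnKey {
        final String host;
        final int port;
        final int rpcTimeoutMs;

        ConnKey(String host, int port, int rpcTimeoutMs) {
            this.host = host;
            this.port = port;
            this.rpcTimeoutMs = rpcTimeoutMs;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ConnKey)) {
                return false;
            }
            ConnKey other = (ConnKey) o;
            return port == other.port
                && rpcTimeoutMs == other.rpcTimeoutMs
                && host.equals(other.host);
        }

        @Override
        public int hashCode() {
            return Objects.hash(host, port, rpcTimeoutMs);
        }
    }

    // Stand-in for an open TCP connection.
    static final class Conn {
        final int id;

        Conn(int id) {
            this.id = id;
        }
    }

    private final Map<ConnKey, Conn> cache = new HashMap<>();
    private int opened = 0;

    // Reuses a cached connection only when host, port, AND timeout all match.
    Conn get(String host, int port, int rpcTimeoutMs) {
        return cache.computeIfAbsent(
            new ConnKey(host, port, rpcTimeoutMs), k -> new Conn(++opened));
    }

    int openConnections() {
        return cache.size();
    }
}
```

In this model, calling `get("nn1", 8020, 60000)` twice yields one connection, while a further `get("nn1", 8020, 500)` opens a second one to the same host, which is the behaviour this issue wants to avoid for short-timeout probes.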
A use case that motivated us to consider UDP is getHAServiceState() in ObserverReadProxyProvider. We'd like getHAServiceState() to time out with a much smaller threshold and move on to probe the next Namenode. To support this, HDFS-17030 used an executorService and set a timeout on the task. That implementation can be improved by using UDP to query HAServiceState. getHAServiceState() does not have to be very reliable, as we can always fall back to the active Namenode.
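A UDP probe with a short timeout and a fallback could look roughly like the following. This is a minimal sketch under stated assumptions, not the proposed implementation: the class `UdpStateProbe`, the `GET_HA_STATE` request, and the plain-text `OBSERVER` reply are all hypothetical, and the real wire format is not specified in this issue.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical client-side probe; names and message format are assumptions.
class UdpStateProbe {
    // Sends one datagram and waits up to timeoutMs for a reply.
    // Returns the reply text, or null if nothing arrived in time.
    static String probe(InetSocketAddress target, int timeoutMs) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMs); // bounds the blocking receive() below
            byte[] request = "GET_HA_STATE".getBytes(StandardCharsets.UTF_8);
            socket.send(new DatagramPacket(request, request.length, target));

            byte[] buf = new byte[64];
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            try {
                socket.receive(reply);
            } catch (SocketTimeoutException e) {
                return null; // lost or slow reply: the caller just moves on
            }
            return new String(reply.getData(), 0, reply.getLength(), StandardCharsets.UTF_8);
        }
    }

    // Probes each Namenode in turn and returns the first that reports OBSERVER.
    // Falls back to the active Namenode when no probe succeeds.
    static InetSocketAddress pickObserver(
            List<InetSocketAddress> namenodes, InetSocketAddress active, int timeoutMs) {
        for (InetSocketAddress nn : namenodes) {
            try {
                if ("OBSERVER".equals(probe(nn, timeoutMs))) {
                    return nn;
                }
            } catch (IOException e) {
                // Treat any I/O failure as "unreachable" and try the next one.
            }
        }
        return active;
    }
}
```

Because the probe uses its own short-lived datagram socket, its timeout is independent of the RPC timeout on the TCP connections, and a dropped packet costs only one timeout interval before the next Namenode is tried.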
Another motivation is that roughly 5~10% of the RPC calls hitting our active/observers appear to be getHAServiceState(). If we can move them off to the UDP server, that should hopefully improve RPC latency.
Attachments
Issue Links
- requires: HADOOP-18981 Move oncrpc/portmap from hadoop-nfs to hadoop-common (Resolved)