  1. Hadoop HDFS
  2. HDFS-17286

Add UDP as a transfer protocol for HDFS


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • None
    • None
    • Component/s: hdfs
    • None

    Description

      Right now, every connection in HDFS goes through RPC/IPC, which runs over TCP. Connections are reused based on ConnectionID, which includes RpcTimeout as part of the key that identifies a connection. As a consequence, using a different RPC timeout between the same two hosts creates a separate TCP connection.
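
To illustrate the effect described above, here is a minimal sketch of a connection cache whose key includes the timeout. `ConnectionCacheSketch` and `ConnKey` are hypothetical names invented for this example and are not the Hadoop IPC classes; the sketch assumes only that the key is compared by address plus timeout.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ConnectionCacheSketch {
    /** Hypothetical stand-in for a connection key: the timeout is part of its identity. */
    static final class ConnKey {
        final String address;
        final int rpcTimeoutMs;

        ConnKey(String address, int rpcTimeoutMs) {
            this.address = address;
            this.rpcTimeoutMs = rpcTimeoutMs;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ConnKey)) {
                return false;
            }
            ConnKey other = (ConnKey) o;
            return address.equals(other.address) && rpcTimeoutMs == other.rpcTimeoutMs;
        }

        @Override
        public int hashCode() {
            return Objects.hash(address, rpcTimeoutMs);
        }
    }

    // Each distinct key maps to its own (simulated) TCP connection.
    private final Map<ConnKey, Object> connections = new HashMap<>();

    /** Returns the cached connection for this key, creating one if none exists. */
    Object getConnection(String address, int rpcTimeoutMs) {
        return connections.computeIfAbsent(new ConnKey(address, rpcTimeoutMs), k -> new Object());
    }

    int openConnections() {
        return connections.size();
    }
}
```

With this keying, two calls to the same address with the same timeout share one entry, while a call with a shorter timeout to the same address opens a second one.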

      A use case that motivated us to consider UDP is getHAServiceState() in ObserverReadProxyProvider. We would like getHAServiceState() to time out after a much shorter threshold and move on to probing the next NameNode. To support this, in HDFS-17030 we ran the call as a task on an ExecutorService and set a timeout on that task. This implementation could be improved by using UDP to query HAServiceState. getHAServiceState() does not have to be very reliable, because we can always fall back to the active NameNode.
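
The executor-based timeout mentioned above can be sketched roughly as follows. This is not the HDFS-17030 patch itself; `TimedProbeSketch` and `callWithTimeout` are hypothetical names, and the sketch assumes that any failure or timeout is treated as "state unknown" so that the caller can move on to the next NameNode.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedProbeSketch {
    /**
     * Runs the call on the pool and waits at most timeoutMs for its result.
     * An empty result means the call failed or did not finish in time.
     */
    static <T> Optional<T> callWithTimeout(ExecutorService pool, Callable<T> call, long timeoutMs) {
        Future<T> future = pool.submit(call);
        try {
            return Optional.ofNullable(future.get(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (TimeoutException | ExecutionException e) {
            // Give up on this probe; interrupt the task if it is still running.
            future.cancel(true);
            return Optional.empty();
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return Optional.empty();
        }
    }
}
```

Note that the underlying RPC still uses the connection's own timeout; the executor only bounds how long the caller waits, which is part of why a separate lightweight protocol is attractive here.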

      Another motivation is that roughly 5~10% of the RPC calls hitting our active and observer NameNodes appear to be getHAServiceState(). If we can move them off to a UDP server, that would hopefully improve RPC latency.
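
As a rough illustration of what a UDP-based state query might look like, the sketch below uses only `java.net.DatagramSocket`. Everything here is an assumption made for the example: the class and method names, the plain-text request and reply, and the one-shot responder are hypothetical, and no wire format has been proposed in this issue. A lost datagram or a timeout yields an empty result, which the caller would treat as "unknown" and fall back to the active NameNode.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

public class UdpHaStateProbeSketch {
    /** Hypothetical one-shot responder on loopback; returns the port it listens on. */
    static int startResponder(String state) {
        final DatagramSocket server;
        try {
            server = new DatagramSocket(0, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread t = new Thread(() -> {
            try {
                byte[] buf = new byte[64];
                DatagramPacket req = new DatagramPacket(buf, buf.length);
                server.receive(req); // wait for a single probe
                byte[] reply = state.getBytes(StandardCharsets.UTF_8);
                server.send(new DatagramPacket(reply, reply.length, req.getAddress(), req.getPort()));
            } catch (IOException e) {
                // No reply is sent; the client will simply time out.
            } finally {
                server.close();
            }
        });
        t.setDaemon(true);
        t.start();
        return server.getLocalPort();
    }

    /** Sends one probe and waits at most timeoutMs; empty means "state unknown". */
    static Optional<String> probe(InetAddress host, int port, int timeoutMs) {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMs); // per-probe timeout, independent of any RPC connection
            byte[] req = "getHAServiceState".getBytes(StandardCharsets.UTF_8);
            socket.send(new DatagramPacket(req, req.length, host, port));
            byte[] buf = new byte[64];
            DatagramPacket resp = new DatagramPacket(buf, buf.length);
            socket.receive(resp);
            return Optional.of(new String(resp.getData(), 0, resp.getLength(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Covers SocketTimeoutException and any other I/O failure.
            return Optional.empty();
        }
    }
}
```

Because each probe carries its own socket timeout, a short threshold for state queries would not force a second TCP connection, and the queries would not occupy the RPC handlers.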


      Attachments

        1. observer.png
          85 kB
          Xing Lin
        2. active.png
          84 kB
          Xing Lin

            People

              Assignee: Unassigned
              Reporter: Xing Lin
              Votes: 0
              Watchers: 4
