  1. Hadoop HDFS
  2. HDFS-17421

Check the correctness of the calculation RpcAuthentication*

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hdfs, metrics
    • Labels: None

    Description

      Hi,

      I have a question about two HDFS metrics. It came up while I was trying to estimate the load on the KDC for a production cluster.

      HDFS exposes two authentication metrics:
      RpcAuthenticationSuccesses - Total number of successful authentication attempts
      RpcAuthenticationFailures - Total number of authentication failures
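      For readers who want to reproduce the readings, here is a minimal sketch of pulling both counters from the NameNode's /jmx endpoint. The host name, the HTTP port 9870, and the RPC port 8020 in the bean name are assumptions, not values from this issue; a Kerberized cluster may also require SPNEGO for the web endpoint, which this sketch does not handle.

```python
import json
from urllib.request import urlopen

# Assumptions (not from the issue): NameNode web port 9870, RPC port 8020,
# and a placeholder host name. Adjust all three for a real cluster.
JMX_URL = (
    "http://namenode.example.com:9870/jmx"
    "?qry=Hadoop:service=NameNode,name=RpcActivityForPort8020"
)


def extract_auth_counters(jmx_payload: dict) -> dict:
    """Return the two authentication counters from a parsed /jmx response."""
    for bean in jmx_payload.get("beans", []):
        if "RpcAuthenticationSuccesses" in bean:
            return {
                "successes": bean["RpcAuthenticationSuccesses"],
                "failures": bean["RpcAuthenticationFailures"],
            }
    raise KeyError("no RPC activity bean found in payload")


def fetch_auth_counters(url: str = JMX_URL) -> dict:
    """Fetch the /jmx document over plain HTTP and extract the counters."""
    with urlopen(url, timeout=10) as response:
        return extract_auth_counters(json.load(response))


# Offline example using the values reported at the start of TEST 1.
sample_payload = {
    "beans": [
        {
            "name": "Hadoop:service=NameNode,name=RpcActivityForPort8020",
            "RpcAuthenticationSuccesses": 208322,
            "RpcAuthenticationFailures": 0,
        }
    ]
}
counters = extract_auth_counters(sample_payload)
```

      Parsing is kept separate from fetching so the extraction can be checked without a running cluster.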

      I expect that any data request in the Hadoop cluster will make
      a request to the KDC to get a ticket,
      then a request to the NameNode,
      after which one of the counters is incremented: +1 to RpcAuthenticationSuccesses if authentication succeeds, or +1 to RpcAuthenticationFailures if it fails

      However, in a test cluster with
      4 DataNodes and 2 NameNodes (HA), I see values for these metrics that I cannot explain.

      I also noticed that RpcAuthenticationSuccesses increases by 1 every 30 seconds, even when the cluster is idle

      TEST 1
      I made sure that:
      1. Only the HDFS-{NN,DN,JN,ZKFC} and YARN-{RM,NM} services are running
      2. All other installed components (Hive, Spark HistoryServer) are disabled
      3. There are no YARN jobs running and no user requests to HDFS

      At the start of the test:
      RpcAuthenticationFailures = 0
      RpcAuthenticationSuccesses = 208322

      To generate load, I ran a spark-submit test using spark-examples_2.12-3.5.0.jar with the number of executors = 1
      The job completed in 1 minute and 20 seconds
      RpcAuthenticationSuccesses = 208338

      In total, the counter grew by 16 during execution
      About +2 can be attributed to the +1 every 30 seconds that I described above. But what do the remaining +14 authentications mean?
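      The arithmetic above can be written out as a short sketch. It assumes the background increment is exactly one per 30 seconds; the function and parameter names are illustrative only.

```python
def unexplained_increments(before, after, elapsed_seconds, background_period_seconds=30):
    """Counter growth left over after subtracting the periodic background increments."""
    total = after - before
    # Assumption: exactly one background increment per full period.
    background = elapsed_seconds // background_period_seconds
    return total - background


# TEST 1: 208322 -> 208338 over 1 minute 20 seconds (80 s).
test1_unexplained = unexplained_increments(208322, 208338, 80)
print(test1_unexplained)
```

      Depending on where the 80-second window falls relative to the 30-second cycle, the background share could be 2 or 3, so the unexplained part is 13 or 14.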

      TEST 2
      RpcAuthenticationFailures = 0
      RpcAuthenticationSuccesses = 208388

      hdfs dfs -ls /
      RpcAuthenticationFailures = 0
      RpcAuthenticationSuccesses = 208389
      The counter grew by 1. Why?
      I ran kinit long before the ls request, so I expected the metric not to change, but maybe I'm wrong

      TEST 3

      Disabled:

      • All DNs
      • Standby NN
      • All YARN services (RM, NM)

      Still running:

      • Three JNs and ZKFC
      • One active NN

      RpcAuthenticationSuccesses still increases by 1 every 30 seconds
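      One way to confirm the 30-second cadence is to poll the counter more often than it changes and measure the time between changes. The sketch below works on (timestamp, value) samples; the sample data is hypothetical and only illustrates the calculation, and it does not identify which client causes the increments.

```python
def seconds_between_increments(samples):
    """Given time-ordered (timestamp_seconds, counter_value) samples,
    return the gaps between the samples at which the counter grew."""
    change_times = [
        t for (_, prev), (t, cur) in zip(samples, samples[1:]) if cur > prev
    ]
    return [later - earlier for earlier, later in zip(change_times, change_times[1:])]


# Hypothetical samples, not measurements from the issue.
hypothetical_samples = [(0, 100), (10, 100), (30, 101), (40, 101), (60, 102), (90, 103)]
intervals = seconds_between_increments(hypothetical_samples)
print(intervals)
```

      The resolution of the result is limited by the polling interval, so sampling every few seconds gives a tighter estimate than sampling every 30.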

      Either I misunderstand what these metrics mean, or they are being counted incorrectly
      Can you tell me how they are calculated? If this is not a bug in the calculation, what is the correct way to interpret these metrics?

      Thank you very much

       


          People

            Assignee: Unassigned
            Reporter: avs75 Chingachgook