Hadoop Map/Reduce / MAPREDUCE-7273

JHS: make sure that Kerberos relogin is performed when KDC becomes offline then online again

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.10.0, 3.2.1, 3.1.3
    • Fix Version/s: 3.4.0
    • Component/s: jobhistoryserver
    • Labels: None

    Description

      In the JobHistoryServer (JHS), if the KDC goes offline, the IPC layer does attempt a Kerberos relogin, but that is not always enough: after a failed attempt, the next retry happens only 60 seconds later. If the KDC comes back online in the meantime, the following error might occur:

      2020-04-09 03:27:52,075 DEBUG ipc.Server (Server.java:processSaslToken(1952)) - Have read input token of size 708 for processing by saslServer.evaluateResponse()
      2020-04-09 03:27:52,077 DEBUG ipc.Server (Server.java:saslProcess(1829)) - javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Invalid argument (400) - Cannot find key of appropriate type to decrypt AP REP - AES128 CTS mode with HMAC SHA1-96)]
              at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:199)
      ...
      

      When this happens, JHS has to be restarted.
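
      The timing gap described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Hadoop's code and not the attached patch: the class ReloginThrottleSketch, its Outcome enum, and the tryRelogin method are hypothetical names, and the 60-second minimum gap is taken from the description. The model assumes that a failed attempt also starts the waiting period, so an attempt made shortly after the KDC returns is skipped.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical model of a throttled Kerberos relogin; not Hadoop's actual code. */
class ReloginThrottleSketch {
    enum Outcome { SUCCEEDED, FAILED, SKIPPED }

    // Assumed minimum gap between attempts, matching the 60 seconds in the description.
    static final long MIN_GAP_MILLIS = 60_000L;

    private boolean attempted = false;
    private long lastAttemptMillis;

    /**
     * Tries to relogin at time nowMillis. kdcLogin stands in for the real login
     * call and returns true when the KDC is reachable.
     */
    Outcome tryRelogin(long nowMillis, BooleanSupplier kdcLogin) {
        // Any earlier attempt, successful or not, blocks new attempts for the gap.
        if (attempted && nowMillis - lastAttemptMillis < MIN_GAP_MILLIS) {
            return Outcome.SKIPPED;
        }
        attempted = true;
        lastAttemptMillis = nowMillis;
        return kdcLogin.getAsBoolean() ? Outcome.SUCCEEDED : Outcome.FAILED;
    }

    public static void main(String[] args) {
        ReloginThrottleSketch throttle = new ReloginThrottleSketch();
        // t = 0 s: KDC is offline, so the attempt fails.
        System.out.println(throttle.tryRelogin(0L, () -> false));
        // t = 10 s: KDC is back, but the attempt is skipped because of the gap.
        System.out.println(throttle.tryRelogin(10_000L, () -> true));
        // t = 61 s: the gap has elapsed, so the relogin goes through.
        System.out.println(throttle.tryRelogin(61_000L, () -> true));
    }
}
```

      In this model, requests arriving between the KDC's return and the end of the waiting period are served with stale credentials, which is the window in which the error above might appear.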

      Attachments

        1. MAPREDUCE-7273-002.patch
          2 kB
          Peter Bacsko
        2. MAPREDUCE-7273-001.patch
          2 kB
          Peter Bacsko

          People

            Assignee: Peter Bacsko
            Reporter: Peter Bacsko
