  1. Apache Ozone
  2. HDDS-1381 Support cloud native Ozone deployment (Phase II)
  3. HDDS-1506

Ozone Manager can't be started with existing reverse dns

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 0.4.0
    • None
    • None

    Description

      I tried to start the current Ozone in Kubernetes, and Ozone Manager initialization failed with the following error:

      2019-05-09 08:40:23 INFO  OzoneManager:51 - registered UNIX signal handlers for [TERM, HUP, INT]
      2019-05-09 08:40:23 WARN  ScmUtils:63 - ozone.om.db.dirs is not configured. We recommend adding this setting. Falling back to ozone.metadata.dirs instead.
      2019-05-09 08:40:24 INFO  OzoneManager:1029 - Initializing secure OzoneManager.
      2019-05-09 08:40:24 ERROR OMCertificateClient:209 - Default certificate serial id is not set. Can't locate the default certificate for this client.
      2019-05-09 08:40:24 INFO  OMCertificateClient:588 - Certificate client init case: 0
      2019-05-09 08:40:24 INFO  OMCertificateClient:55 - Creating keypair for client as keypair and certificate not found.
      2019-05-09 08:40:24 INFO  OzoneManager:1035 - Init response: GETCERT
      2019-05-09 08:40:24 INFO  OzoneSecurityUtil:103 - Adding ip:192.168.11.208,host:om-0.om.default.svc.cluster.local
      2019-05-09 08:40:24 INFO  OzoneSecurityUtil:107 - ip:127.0.0.1,host:localhost not returned.
      2019-05-09 08:40:24 ERROR OzoneManager:1421 - Incorrect om rpc address. omRpcAdd:om-0.om:9862
      2019-05-09 08:40:24 ERROR OzoneManager:888 - Failed to start the OzoneManager.
      java.lang.RuntimeException: Can't get SCM signed certificate. omRpcAdd: om-0.om:9862
      	at org.apache.hadoop.ozone.om.OzoneManager.getSCMSignedCert(OzoneManager.java:1422)
      	at org.apache.hadoop.ozone.om.OzoneManager.initializeSecurity(OzoneManager.java:1041)
      	at org.apache.hadoop.ozone.om.OzoneManager.omInit(OzoneManager.java:994)
      	at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:951)
      	at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:882)
      2019-05-09 08:40:24 INFO  ExitUtil:210 - Exiting with status 1: java.lang.RuntimeException: Can't get SCM signed certificate. omRpcAdd: om-0.om:9862
      2019-05-09 08:40:24 INFO  OzoneManager:51 - SHUTDOWN_MSG: 
      

      The root of the problem is this method in OzoneManager:

       private static void getSCMSignedCert(CertificateClient client,
            OzoneConfiguration config, OMStorage omStore) throws IOException {
         ...
          omRpcAdd = OmUtils.getOmAddress(config);
          if (omRpcAdd == null || omRpcAdd.getAddress() == null) {
            LOG.error("Incorrect om rpc address. omRpcAdd:{}", omRpcAdd);
            throw new RuntimeException("Can't get SCM signed certificate. " +
                "omRpcAdd: " + omRpcAdd);
          }
      

      In my case, omRpcAdd.getAddress() seems to be null at initialization time, because the reverse DNS entry becomes available only once OM has started. This is a classic chicken-and-egg problem: initialization needs the reverse DNS entry, but that entry is added only when the container is started.
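
      The failing condition can be reproduced outside Ozone. The following is a minimal sketch, assuming that OmUtils.getOmAddress returns a java.net.InetSocketAddress (the null check on getAddress() suggests this, but it is not confirmed here). The class UnresolvedAddressDemo and the helper isResolved are hypothetical names used only for illustration; they are not part of Ozone.

```java
import java.net.InetSocketAddress;

// Hypothetical demo class, not part of Ozone.
public class UnresolvedAddressDemo {

  // Mirrors the null check quoted above: the address is usable only if
  // the socket address exists and its host name has been resolved.
  public static boolean isResolved(InetSocketAddress addr) {
    return addr != null && addr.getAddress() != null;
  }

  public static void main(String[] args) {
    // createUnresolved never performs a DNS lookup, so it models a host
    // name that has no DNS entry yet (om-0.om:9862 is taken from the log).
    InetSocketAddress pending =
        InetSocketAddress.createUnresolved("om-0.om", 9862);
    System.out.println(pending.getHostString() + " resolved: "
        + isResolved(pending));

    // A literal IP address needs no DNS lookup, so getAddress() is set.
    InetSocketAddress literal = new InetSocketAddress("127.0.0.1", 9862);
    System.out.println(literal.getHostString() + " resolved: "
        + isResolved(literal));
  }
}
```

      For an unresolved host name, getAddress() returns null, which is exactly the condition that makes getSCMSignedCert throw. An address built from a literal IP passes the same check.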

            People

              Assignee: Marton Elek (elek)
              Reporter: Marton Elek (elek)
              Votes: 0
              Watchers: 3
