FLINK-31775

High-Availability not supported in kubernetes when istio enabled

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.16.1
    • None
    • None

    Description

      When Flink runs in native Kubernetes deployment mode with high availability (HA) enabled and a new TaskManager pod is started to process a job, the TaskManager pod attempts to register itself with the ResourceManager (JobManager). The TaskManager looks up the ResourceManager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1).

       

      However, when Istio is enabled, resolution by IP address is blocked, so the job cannot start because the TaskManager cannot register with the ResourceManager:

      2023-04-10 23:24:19,752 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1.

       

      Note that when HA is disabled, the ResourceManager is resolved by service name, so it can be found:

       

      2023-04-11 00:49:34,162 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Successful registration at resource manager akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923.
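
When triaging logs like the two above, it can help to check mechanically whether a ResourceManager address points at a pod IP or at a service name. The following is a minimal sketch in Python, not part of Flink: the helper names `rpc_host` and `is_ip_literal` are hypothetical, and it assumes the address has the `akka.tcp://flink@host:port/path` shape seen in the log lines.

```python
import ipaddress
from urllib.parse import urlsplit


def rpc_host(address: str) -> str:
    """Return the host part of an Akka RPC address (hypothetical helper).

    Assumes the shape akka.tcp://flink@host:port/user/rpc/..., as in the logs.
    """
    host = urlsplit(address).hostname
    if host is None:
        raise ValueError(f"no host found in {address!r}")
    return host


def is_ip_literal(host: str) -> bool:
    """Return True if host is an IPv4/IPv6 literal rather than a DNS name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # Addresses copied from the two log lines in this report.
    ha_address = "akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1"
    non_ha_address = (
        "akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123"
        "/user/rpc/resourcemanager_*"
    )
    for addr in (ha_address, non_ha_address):
        host = rpc_host(addr)
        kind = "IP address" if is_ip_literal(host) else "service name"
        print(f"{host}: {kind}")
```

With the addresses from this report, the HA case is classified as an IP address and the non-HA case as a service name, which matches the behavior described above.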

       

      Note that in my case it is not possible to disable Istio as described here: https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html

       

      Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 won't fix this case. FLINK-28171 is about the Flink Kubernetes Operator, whereas this issue is about native Kubernetes deployment.

       

            People

              Assignee: Unassigned
              Reporter: Sergio Sainz (sergiosp)
              Votes: 0
              Watchers: 3
