Flink / FLINK-15788 Various Kubernetes integration improvements / FLINK-15817

Kubernetes Resource leak while deployment exception happens



    Description

      When a new session cluster is deployed on Kubernetes, the client usually creates four Kubernetes components in the following order: internal Service -> rest Service -> ConfigMap -> JobManager Deployment.

      Once the internal Service has been created, any exception that aborts the cluster deployment leaks Kubernetes resources. For example:

      1. If creating the rest Service fails because of the service name constraint (FLINK-15816), the internal Service is not cleaned up when the deployment terminates.
      2. If creating the JobManager Deployment fails (for instance, when jobmanager.heap.size is set to a small value such as 512M, which is less than the default value of 'containerized.heap-cutoff-min'), the internal Service, the rest Service, and the ConfigMap all leak.
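
To make the second case concrete, here is a minimal sketch of how a heap cutoff check can reject a small heap size. It assumes the cutoff is the larger of a minimum value and a fraction of the configured memory. The class and method names are hypothetical, and the values 600 MB and 0.25 are illustrative assumptions rather than values stated in this ticket, so check the Flink configuration reference for the actual defaults and formula.

```java
/** Hypothetical sketch of a heap cutoff check; not Flink's actual implementation. */
public class HeapCutoffSketch {

    /**
     * Returns the heap that remains after subtracting the cutoff.
     * Assumption: cutoff = max(cutoffMinMB, memoryMB * cutoffRatio).
     */
    static int heapAfterCutoffMB(int memoryMB, int cutoffMinMB, double cutoffRatio) {
        int cutoffMB = Math.max(cutoffMinMB, (int) (memoryMB * cutoffRatio));
        if (cutoffMB >= memoryMB) {
            // The cutoff would consume all configured memory, so deployment cannot proceed.
            throw new IllegalArgumentException(
                    "Cutoff of " + cutoffMB + " MB is not below the configured " + memoryMB + " MB");
        }
        return memoryMB - cutoffMB;
    }

    public static void main(String[] args) {
        int assumedCutoffMinMB = 600;     // illustrative assumption
        double assumedCutoffRatio = 0.25; // illustrative assumption

        System.out.println(heapAfterCutoffMB(1024, assumedCutoffMinMB, assumedCutoffRatio));
        try {
            heapAfterCutoffMB(512, assumedCutoffMinMB, assumedCutoffRatio);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under these assumptions a 512 MB heap is rejected. In the scenario described above, that rejection happens only after the two Services and the ConfigMap already exist.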

      This ticket proposes cleaning up the residual Services and ConfigMap if the cluster deployment terminates unexpectedly on the client side.
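
One way to picture the proposed clean-up is a client that remembers what it has created and deletes those resources in reverse order when a later step fails. The following is only a sketch under stated assumptions: `ResourceClient`, `FakeClient`, and `deploy` are hypothetical names used for illustration, not Flink's or any Kubernetes client library's API, and this ticket does not say the fix is implemented this way.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch: create resources in order and roll them back on failure. */
public class DeploymentRollbackSketch {

    /** Hypothetical stand-in for a Kubernetes client. */
    interface ResourceClient {
        void create(String resource) throws Exception;
        void delete(String resource);
    }

    /** Creation order as given in the ticket description. */
    static final List<String> CREATION_ORDER = Arrays.asList(
            "internal Service", "rest Service", "ConfigMap", "JobManager Deployment");

    static void deploy(ResourceClient client) throws Exception {
        Deque<String> created = new ArrayDeque<>();
        try {
            for (String resource : CREATION_ORDER) {
                client.create(resource);
                created.push(resource); // record only what was actually created
            }
        } catch (Exception e) {
            // Delete already-created resources, most recent first.
            while (!created.isEmpty()) {
                String resource = created.pop();
                try {
                    client.delete(resource);
                } catch (RuntimeException cleanupFailure) {
                    e.addSuppressed(cleanupFailure); // keep the original failure visible
                }
            }
            throw e;
        }
    }

    /** In-memory fake that fails when asked to create one chosen resource. */
    static final class FakeClient implements ResourceClient {
        final List<String> live = new ArrayList<>();
        final String failOn;

        FakeClient(String failOn) {
            this.failOn = failOn;
        }

        @Override
        public void create(String resource) throws Exception {
            if (resource.equals(failOn)) {
                throw new IllegalStateException("cannot create " + resource);
            }
            live.add(resource);
        }

        @Override
        public void delete(String resource) {
            live.remove(resource);
        }
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient("JobManager Deployment");
        try {
            deploy(client);
        } catch (Exception e) {
            System.out.println("deployment failed: " + e.getMessage());
        }
        System.out.println("resources left behind: " + client.live);
    }
}
```

With the fake client failing on the JobManager Deployment, the three earlier resources are deleted again and nothing is left behind. Rethrowing the original exception keeps the root cause visible to the caller, with any clean-up failures attached as suppressed exceptions.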

            People

              felixzheng Canbin Zheng

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m