FLINK-33683

FLIP-407 Improve Flink Client performance in interactive scenarios

    Description

      Currently, submitting jobs to a long-running Flink cluster and fetching results from it involves a lot of unnecessary overhead. This is acceptable for streaming and batch jobs, because in these scenarios users do not frequently submit jobs to, or fetch results from, a running cluster.

      In OLAP scenarios, however, users continuously submit large numbers of short-lived jobs to the running cluster. In this situation, the overhead has a significant impact on end-to-end (E2E) performance. Here are some examples of unnecessary overhead:

      • Each `RemoteExecutor` creates a new `StandaloneClusterDescriptor` when executing a job, even when the target is the same remote cluster
      • `StandaloneClusterDescriptor` always creates a new `RestClusterClient` when retrieving an existing Flink cluster
      • Each `RestClusterClient` creates a new `ClientHighAvailabilityServices`, which might contain a resource-consuming HA client (ZKClient or KubeClient) and perform a time-consuming leader retrieval operation
      • `RestClient` creates a new connection for every request, which adds connection establishment time to each call

       

      Therefore, I propose this ticket and the following subtasks to reduce this overhead. This ticket also relates to FLINK-25318.
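
To illustrate the general idea behind removing the overhead listed above, here is a minimal sketch of reusing one expensive client per cluster address instead of building a new one for every job. It is not Flink code and not necessarily the design this ticket adopts: `ExpensiveClient`, `ClientCache`, and the address string are hypothetical stand-ins for objects such as `RestClusterClient`, and only standard JDK classes are used.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: reuse one expensive client per cluster address. */
public class ClientCacheSketch {

    /** Stand-in for a costly client; this is NOT Flink's RestClusterClient. */
    static final class ExpensiveClient {
        final String address;

        ExpensiveClient(String address) {
            // A real client would set up HA services, leader retrieval,
            // and connections here; the sketch only stores the address.
            this.address = address;
        }

        String submit(String job) {
            return job + "@" + address;
        }
    }

    /** Caches clients by cluster address so repeated submissions share one. */
    static final class ClientCache {
        private final Map<String, ExpensiveClient> clients = new ConcurrentHashMap<>();
        private final AtomicInteger created = new AtomicInteger();

        ExpensiveClient get(String address) {
            // computeIfAbsent runs the factory at most once per key,
            // so each address gets exactly one client.
            return clients.computeIfAbsent(address, a -> {
                created.incrementAndGet();
                return new ExpensiveClient(a);
            });
        }

        int createdCount() {
            return created.get();
        }
    }

    public static void main(String[] args) {
        ClientCache cache = new ClientCache();
        // Simulate an OLAP workload: many short-lived jobs on one cluster.
        for (int i = 0; i < 1000; i++) {
            cache.get("cluster-a:8081").submit("query-" + i);
        }
        System.out.println("clients created: " + cache.createdCount());
    }
}
```

With the cache, the 1000 simulated submissions construct a single client; without it, each submission would pay the construction cost again. A real implementation would also have to decide when cached clients are closed and how stale entries are evicted, which this sketch leaves out.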

            People

              xiangyu0xf xiangyu feng