[FLINK-33683] FLIP-407 Improve Flink Client performance in interactive scenarios - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Client / Job Submission, Table SQL / Client
Labels: None

Description

Submitting jobs to a long-running Flink cluster and fetching their results currently involves a lot of unnecessary overhead. This is acceptable for streaming and batch jobs, because in those scenarios users do not frequently submit jobs to, or fetch results from, a running cluster.

In the OLAP scenario, however, users continuously submit large numbers of short-lived jobs to the running cluster. In this situation, the overhead has a significant impact on end-to-end (E2E) performance. Here are some examples of unnecessary overhead:

Each `RemoteExecutor` will create a new `StandaloneClusterDescriptor` when executing a job on the same remote cluster
`StandaloneClusterDescriptor` will always create a new `RestClusterClient` when retrieving an existing Flink Cluster
Each `RestClusterClient` will create a new `ClientHighAvailabilityServices`, which might contain a resource-consuming HA client (ZKClient or KubeClient) and a time-consuming leader retrieval operation
`RestClient` will create a new connection for every request, which costs extra connection establishment time

Therefore, I suggest creating this ticket and the following subtasks to improve client performance in this scenario. This ticket is also related to FLINK-25318.
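
The reuse idea behind several of the overhead examples above can be sketched as a small cache keyed by cluster address. This is only an illustration under assumptions: `ExpensiveClient` and `ClientCache` are hypothetical names, not Flink classes, and a real change would also have to handle closing clients and distinguishing clusters correctly.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical stand-in for a client that is expensive to construct. */
final class ExpensiveClient {
    // Counts how many clients were actually constructed.
    static final AtomicInteger CREATED = new AtomicInteger();

    final String clusterAddress;

    ExpensiveClient(String clusterAddress) {
        this.clusterAddress = clusterAddress;
        CREATED.incrementAndGet();
    }
}

/** Hypothetical cache that hands out one shared client per cluster address. */
final class ClientCache<C> {
    private final Map<String, C> clients = new ConcurrentHashMap<>();
    private final Function<String, C> factory;

    ClientCache(Function<String, C> factory) {
        this.factory = factory;
    }

    /** Returns the cached client for this address, creating it only on first use. */
    C getOrCreate(String clusterAddress) {
        return clients.computeIfAbsent(clusterAddress, factory);
    }

    /** Number of distinct cluster addresses that currently have a cached client. */
    int size() {
        return clients.size();
    }
}
```

With such a cache, two submissions to the same address share one client, while a different address gets its own. The cache key matters: if two different remote clusters map to the same key, they would wrongly share a client.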

Issue Links

is related to

FLINK-25318 Improvement of scheduler and execution for Flink OLAP

Open

Sub-Tasks

1.	Improve the retry strategy in CollectResultFetcher	Open	Unassigned
2.	StandaloneClusterId need to distinguish different remote clusters	Open	Unassigned
3.	Reuse ClusterDescriptor in AbstractSessionClusterExecutor when executing jobs on the same cluster	Open	Unassigned
4.	Reuse RestClusterClient in StandaloneClusterDescriptor to avoid frequent thread create/destroy	Open	Unassigned
5.	Reuse Channels in RestClient to save connection establish time	Open	Unassigned
6.	Reuse ClientHighAvailabilityServices when creating RestClusterClient	In Progress	xiangyu feng
7.	Add the backoff-multiplier configuration in ExponentialBackoffRetryStrategy	Open	Unassigned
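
As a rough illustration of what a configurable backoff multiplier (sub-task 7) could mean for a retry delay, here is a minimal sketch. `BackoffSketch` and its parameters are hypothetical and do not reflect the actual `ExponentialBackoffRetryStrategy` API. The delay is assumed to be the initial delay times the multiplier raised to the attempt number, capped at a maximum.

```java
import java.time.Duration;

/** Hypothetical sketch of an exponential backoff with a configurable multiplier. */
final class BackoffSketch {
    private final Duration initialDelay;
    private final Duration maxDelay;
    private final double multiplier;

    BackoffSketch(Duration initialDelay, Duration maxDelay, double multiplier) {
        if (multiplier < 1.0) {
            throw new IllegalArgumentException("multiplier must be >= 1.0");
        }
        if (maxDelay.compareTo(initialDelay) < 0) {
            throw new IllegalArgumentException("maxDelay must be >= initialDelay");
        }
        this.initialDelay = initialDelay;
        this.maxDelay = maxDelay;
        this.multiplier = multiplier;
    }

    /** Delay before the given retry attempt (0-based), capped at maxDelay. */
    Duration delayForAttempt(int attempt) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        double uncapped = initialDelay.toMillis() * Math.pow(multiplier, attempt);
        long capped = (long) Math.min(uncapped, (double) maxDelay.toMillis());
        return Duration.ofMillis(capped);
    }
}
```

For short-lived OLAP queries, a smaller multiplier keeps polling intervals short for longer, at the cost of more requests. Whether that trade-off is worthwhile is workload-dependent.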

People

Assignee: xiangyu feng

Reporter: xiangyu feng

Votes: 0

Watchers: 4

Dates

Created: 29/Nov/23 06:39

Updated: 24/Jan/24 09:06