[FLINK-35291] Improve the ROW data deserialization performance of DebeziumEventDeserializationScheme - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 1.20.0
Fix Version/s: 1.20.0
Component/s: Flink CDC
Labels:
- pull-request-available

Description

While performance testing Flink CDC 3.0, we found through Arthas profiling that the deserialization of row data is a significant bottleneck. The main cost is the String.format call in the BinaryRecordDataGenerator class, so we made a simple performance optimization.
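
To illustrate why a String.format call on a per-record path can be costly, here is a minimal sketch. It assumes the formatted string is an error message handed to an argument check that almost always passes; this is an assumption for illustration, and the method names below are hypothetical rather than taken from the Flink CDC code base or the linked pull request.

```java
public class LazyCheckSketch {

    /** Hypothetical eager check: the caller has already built the message. */
    static void checkArgumentEager(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    /** Hypothetical lazy check: the template is formatted only when the check fails. */
    static void checkArgumentLazy(boolean condition, String template, Object... args) {
        if (!condition) {
            throw new IllegalArgumentException(String.format(template, args));
        }
    }

    public static void main(String[] args) {
        int types = 5;
        int serializers = 5;

        // Eager: String.format runs on every call, even though the check passes.
        checkArgumentEager(
                types == serializers,
                String.format("types is %d and serializers is %d", types, serializers));

        // Lazy: on the passing path only the comparison and the varargs
        // array allocation happen; String.format is never invoked.
        checkArgumentLazy(
                types == serializers,
                "types is %d and serializers is %d", types, serializers);
    }
}
```

If the check sits in a constructor that runs once per record, the eager variant pays the formatting cost for every record, which would be consistent with the profile described here.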

Test environment:

flink: 1.20-SNAPSHOT master
flink-cdc: 3.2-SNAPSHOT master
1CU minicluster mode

source:
  type: mysql
  hostname: localhost
  port: 3308
  username: root
  password: 123456
  tables: test.user_behavior
  server-id: 5400-5404
  #server-time-zone: UTC
  scan.startup.mode: earliest-offset
  debezium.poll.interval.ms: 10

sink:
  type: values
  name: Values Sink
  materialized.in.memory: false
  print.enabled: false

pipeline:
  name: Sync MySQL Database to Values
  parallelism: 1

Before optimization: 3.5w/s (about 35,000 records per second)

cdc-3.0-1c.html

Analyzing the flame graph shows that approximately 24.45% of the time is spent in String.format.

After optimization: 5w/s (about 50,000 records per second)

cdc-3.0-1c-2.html

After optimization, 4.7% of the time (extractBeforeDataRecord + extractAfterDataRecord combined) is still spent in org/apache/flink/cdc/runtime/typeutils/BinaryRecordDataGenerator.<init>, so there may be room for further optimization.
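
One possible direction for that further optimization, sketched under assumptions: if the constructor cost comes from building a new generator for every record, the generator could be created once per schema and reused. The class below is a hypothetical, generic cache and not part of Flink CDC; it assumes the schema key has stable equals/hashCode and that the cached object is safe to reuse across records.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache: builds one generator per schema key and reuses it. */
public class GeneratorCacheSketch<K, G> {

    private final Map<K, G> cache = new ConcurrentHashMap<>();
    private final Function<K, G> factory;

    public GeneratorCacheSketch(Function<K, G> factory) {
        this.factory = factory;
    }

    /** Returns the cached generator for the key, creating it on first use. */
    public G get(K schemaKey) {
        return cache.computeIfAbsent(schemaKey, factory);
    }

    /** Number of distinct schema keys seen so far. */
    public int size() {
        return cache.size();
    }
}
```

With such a cache, the constructor would run once per distinct schema instead of once per record; whether that is valid depends on the generator holding no per-record state that leaks between calls.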

Attachments

- cdc-3.0-1c.html (309 kB, 05/May/24 16:21, LiuZeshan)
- image-2024-05-06-00-29-34-618.png (576 kB, 05/May/24 16:29, LiuZeshan)
- cdc-3.0-1c-2.html (299 kB, 05/May/24 16:36, LiuZeshan)
- image-2024-05-06-00-37-16-028.png (583 kB, 05/May/24 16:37, LiuZeshan)

Issue Links

links to

GitHub Pull Request #3289

People

Assignee: Unassigned

Reporter: LiuZeshan

Votes: 0

Watchers: 1

Dates

Created: 05/May/24 16:42

Updated: 09/May/24 02:01