[IMPALA-12350] Daemon fails to initialize large catalog - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Workaround
Affects Version/s: Impala 4.2.0
Fix Version/s: None
Component/s: None
Labels: None

Description

When the statestore's catalog topic is large enough (>2 GB), daemons fail to restart and get stuck in a loop, repeatedly logging:

I0808 13:07:17.702653 3633556 Frontend.java:1618] Waiting for local catalog to be initialized, attempt: 2068
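To check whether a daemon is stuck in this loop, the attempt counter can be pulled out of its log. The following is a minimal sketch, assuming the message text stays exactly as in the line above; `latest_attempt` is a hypothetical helper, not part of Impala.

```python
import re

# Matches the retry message shown in the log excerpt above.
# Assumption: the wording is stable across daemon versions.
ATTEMPT_RE = re.compile(
    r"Waiting for local catalog to be initialized, attempt: (\d+)"
)


def latest_attempt(log_lines):
    """Return the highest attempt number seen, or None if the message never appears."""
    latest = None
    for line in log_lines:
        match = ATTEMPT_RE.search(line)
        if match:
            attempt = int(match.group(1))
            latest = attempt if latest is None else max(latest, attempt)
    return latest


sample = [
    "I0808 13:07:17.702653 3633556 Frontend.java:1618] "
    "Waiting for local catalog to be initialized, attempt: 2068",
]
print(latest_attempt(sample))
```

A counter that keeps climbing into the thousands, as here, suggests the daemon never receives a usable catalog update rather than merely starting slowly.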

The statestored reports errors as follows:

I0808 13:07:05.587296 2134270 thrift-util.cc:196] TSocket::write_partial() send() <Host: gs1-hdp-data70 Port: 23000>: Broken pipe
I0808 13:07:05.587356 2134270 client-cache.h:362] RPC Error: Client for gs1-hdp-data70:23000 hit an unexpected exception: write() send(): Broken pipe, type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala20TUpdateStateResponseE, send: not done
I0808 13:07:05.587365 2134270 client-cache.cc:174] Broken Connection, destroy client for gs1-hdp-data70:23000

When this happens, we are forced to restart the statestore and thus the whole cluster, which means we cannot tolerate the failure of even a single daemon.

Interestingly, the catalog topic grew significantly after upgrading from 3.4.0 to 4.2.0, from ~800 MB to ~3.4 GB. Invalidate/refresh operations also became significantly slower (~10 ms -> 5 s).

This is probably related to thrift_rpc_max_message_size, but as far as I can see its maximum value is 2 GB.
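The arithmetic behind that suspicion can be sketched as follows. This assumes the 2 GB cap is a signed 32-bit byte count (2^31 - 1), that the topic is sent as a single message, and that the reported sizes are decimal units; `fits_in_single_message` is a hypothetical helper used only for illustration.

```python
# Assumption: the maximum message size is bounded by a signed 32-bit integer.
INT32_MAX = 2**31 - 1  # 2,147,483,647 bytes, roughly 2 GB


def fits_in_single_message(topic_bytes: int, limit: int = INT32_MAX) -> bool:
    """Return True if a topic of this size stays within the message size cap."""
    return topic_bytes <= limit


# Approximate topic sizes from this report, read as decimal MB/GB (an assumption).
before_upgrade = 800 * 10**6   # ~800 MB on 3.4.0
after_upgrade = 3400 * 10**6   # ~3.4 GB on 4.2.0

print(fits_in_single_message(before_upgrade))
print(fits_in_single_message(after_upgrade))
```

Under these assumptions the pre-upgrade topic fits within the cap while the post-upgrade topic exceeds it by more than 1 GB, so raising the flag to its maximum would not be enough.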

Issue Links

duplicates: IMPALA-13020 catalog-topic updates >2GB do not work due to Thrift's max message size (Resolved)

People

Assignee: Unassigned

Reporter: Saulius Valatka

Votes: 0

Watchers: 3

Dates

Created: 08/Aug/23 13:16

Updated: 20/Apr/24 02:46

Resolved: 20/Apr/24 02:46