[HBASE-19834] Signalling server-hosted-clients to abort retries - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: 2.0.0-beta-2, 2.0.0
Component/s: None
Labels: None

Description

A few recent flaky tests have been variations on the server-hosted client retrying against a server or region that is never going to show up, usually because the cluster is being shut down. One example is a client stuck retrying an update to hbase:meta with a change in region or table state while hbase:meta is down. Another is ~~HBASE-19794~~, where the test hangs because the backup Master is trying to become active and, as part of startup, is trying to read table state from hbase:meta, but hbase:meta is not available; it has been taken down as part of the cluster shutdown.

One difficulty is that the Master main thread can get hung up by the client retries (in some cases the client retries are inlined with the main thread, so it is 'blocked'); it is then no longer available to receive cluster shutdown or other event types (e.g. see ~~HBASE-19794~~). Some of our startup needs to be refactored and moved into our run method rather than done as one big single-threaded startup, as happens now in Master. We need this also for the ~~HBASE-19831~~ work.
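As a rough illustration of the behaviour being asked for, here is a minimal sketch of a retry loop that checks a stop signal before every attempt, so a caller stuck retrying against an unavailable hbase:meta can give up once shutdown is signalled. This is not HBase code: `AbortableRetry`, `RetriesAbortedException`, and the `stopped` supplier are hypothetical names chosen for the example, and the fixed pause stands in for whatever backoff policy the real client uses.

```java
import java.util.concurrent.Callable;
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of HBase: retries an operation until a stop signal is raised. */
final class AbortableRetry {

  /** Thrown when retries are abandoned because the stop signal was raised. */
  static class RetriesAbortedException extends Exception {
    RetriesAbortedException(String msg, Throwable cause) {
      super(msg, cause);
    }
  }

  private AbortableRetry() {
  }

  /**
   * Calls op up to maxAttempts times, pausing pauseMs between failed attempts.
   * The stop signal is polled before every attempt, so a cluster shutdown
   * ends the loop instead of letting it spin against a server that is gone.
   */
  static <T> T call(Callable<T> op, BooleanSupplier stopped, int maxAttempts, long pauseMs)
      throws Exception {
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (stopped.getAsBoolean()) {
        throw new RetriesAbortedException("stop signalled before attempt " + attempt, last);
      }
      try {
        return op.call();
      } catch (InterruptedException ie) {
        // Preserve the interrupt so the calling thread can also react to it.
        Thread.currentThread().interrupt();
        throw ie;
      } catch (Exception e) {
        last = e; // remember the failure and try again
      }
      Thread.sleep(pauseMs);
    }
    throw new Exception("retries exhausted after " + maxAttempts + " attempts", last);
  }
}
```

A caller would pass something like a "cluster is stopping" check as `stopped`. The sketch only addresses the retry side; it does not by itself free a main thread that has the retries inlined, which is why the description also calls for moving startup work into the run method.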

Issue Links

is related to: HBASE-19831 Regions on the Master Redux (Resolved)

relates to: HBASE-19830 [AMv2] RPCs while holding (Region) Locks (to update hbase:meta with region state) (Open)

People

Assignee: Unassigned

Reporter: Michael Stack

Votes: 0

Watchers: 2

Dates

Created: 20/Jan/18 20:11

Updated: 21/Mar/18 22:23

Resolved: 22/Jan/18 23:22