[HBASE-16499] slow replication for small HBase clusters - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.5.0, 2.0.0
Component/s: Replication
Labels:
None

Hadoop Flags: Reviewed
Release Note:
Changed the default value of replication.source.ratio from 0.1 to 0.5. By default, 50% of the RegionServers in the peer cluster(s) will now participate in replication.

Description

For small clusters (10-20 nodes) we recently observed that replication progresses very slowly during bulk writes, and a lot of lag accumulates in AgeOfLastShipped / SizeOfLogQueue. From the logs we found that the number of threads used for shipping WAL edits in parallel comes from the following calculation in HBaseInterClusterReplicationEndpoint:

int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
    replicationSinkMgr.getSinks().size());
...
for (int i=0; i<n; i++) {
  entryLists.add(new ArrayList<HLog.Entry>(entries.size()/n+1));  <-- batch size
}
...
for (int i=0; i<entryLists.size(); i++) {
  .....
  // RuntimeExceptions encountered here bubble up and are handled in ReplicationSource
  pool.submit(createReplicator(entryLists.get(i), i));  <-- concurrency
  futures++;
}
}

maxThreads is a fixed, configurable value, and n is the minimum of the three values. So once there are enough edits to replicate, n is effectively determined by replicationSinkMgr.getSinks().size().
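To make the min-of-three calculation concrete, here is a small standalone sketch that reproduces only the arithmetic from the snippet above. The class and method names are hypothetical, and the inputs (maxThreads = 10, 5000 entries, 2 sinks) are illustrative values chosen for this example, not numbers taken from our cluster.

```java
// Hypothetical standalone sketch; it mirrors only the arithmetic quoted above
// and does not depend on any HBase classes.
class ShipperThreadSketch {

    // n = min(min(maxThreads, entries/100 + 1), numSinks)
    static int shipperThreads(int maxThreads, int numEntries, int numSinks) {
        return Math.min(Math.min(maxThreads, numEntries / 100 + 1), numSinks);
    }

    // Initial capacity of each per-thread entry list: entries/n + 1
    static int batchCapacity(int numEntries, int n) {
        return numEntries / n + 1;
    }

    public static void main(String[] args) {
        int maxThreads = 10;   // illustrative value
        int numEntries = 5000; // illustrative batch of WAL entries
        int numSinks = 2;      // a small sink list caps the concurrency

        int n = shipperThreads(maxThreads, numEntries, numSinks);
        System.out.println("threads = " + n);                               // threads = 2
        System.out.println("batch   = " + batchCapacity(numEntries, n));    // batch   = 2501
    }
}
```

With these inputs the entry count alone would allow 51 threads and maxThreads would allow 10, but the sink count limits n to 2.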

replicationSinkMgr.getSinks().size() is in turn derived from
int numSinks = (int) Math.ceil(slaveAddresses.size() * ratio);
where ratio is read from the configuration: this.ratio = conf.getFloat("replication.source.ratio", DEFAULT_REPLICATION_SOURCE_RATIO);
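The effect of the ratio on small peer clusters can be checked with a short standalone sketch of the numSinks formula. The class and method names are hypothetical, and the cluster sizes are just the 10-20 node range discussed in this issue. The ratio is kept as a float to match conf.getFloat.

```java
// Hypothetical standalone sketch of the numSinks formula quoted above;
// no HBase classes are used.
class SinkRatioSketch {

    // numSinks = ceil(number of peer RegionServers * ratio)
    static int numSinks(int slaveAddressCount, float ratio) {
        return (int) Math.ceil(slaveAddressCount * ratio);
    }

    public static void main(String[] args) {
        int[] clusterSizes = {10, 15, 20};
        for (int size : clusterSizes) {
            System.out.println(size + " RegionServers: ratio 0.1 -> "
                + numSinks(size, 0.1f) + " sinks, ratio 0.5 -> "
                + numSinks(size, 0.5f) + " sinks");
        }
    }
}
```

For a 10-node peer this gives 1 sink at ratio 0.1 and 5 sinks at ratio 0.5. For a 20-node peer it gives 2 and 10 respectively.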

Currently DEFAULT_REPLICATION_SOURCE_RATIO is set to 10%, so for small clusters of 10-20 RegionServers the value we get for numSinks, and hence for n, is very small (1 or 2). This substantially reduces the pool concurrency used for shipping WAL edits in parallel, which slows down replication for small clusters and causes a lot of lag to accumulate in AgeOfLastShipped. Sometimes it takes tens of hours to clear the entire replication queue, even after the client has finished writing on the source side.

We are running tests with different values of replication.source.ratio and have seen a multi-fold improvement in total replication time (will update the results here). I propose that we also increase the default value of replication.source.ratio so that there is sufficient concurrency even for small clusters. We only figured this out after many iterations of debugging, so a slightly higher default would probably save others the trouble.

Attachments

HBASE-16499.patch (7 kB), 29/Mar/18 07:02, Ashish Singhi
HBASE-16499.patch (7 kB), 03/Apr/18 12:33, Ashish Singhi
HBASE-16499-addendum.patch (1 kB), 04/Apr/18 05:35, Ashish Singhi

Issue Links

is related to

HBASE-19148 Reevaluate default values of configurations

Resolved

People

Assignee: Ashish Singhi

Reporter: Vikas Vishwakarma

Votes: 0

Watchers: 13

Dates

Created: 25/Aug/16 04:32

Updated: 15/Jan/19 05:53

Resolved: 06/Apr/18 03:50