Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 2.6.0
- None
Description
We are rolling out the server-side TLS settings to all of our QA clusters. This has mostly gone fine, except on one cluster. Most clusters, including this one, have a sampled nettyDirectMemory usage of about 30-100 MB. This cluster tends to get bursts of traffic, during which usage would typically jump to 400-500 MB. Again, this is sampled, so the actual peak could have been higher. When we enabled SSL on this cluster, we started seeing bursts up to at least 4 GB. This exceeded our -XX:MaxDirectMemorySize, which caused OOMs and general chaos on the cluster.
We've gotten it somewhat under control by setting -Dorg.apache.hbase.thirdparty.io.netty.maxDirectMemory and -Dorg.apache.hbase.thirdparty.io.netty.tryReflectionSetAccessible. We've set netty's maxDirectMemory to be approximately equal to (-XX:MaxDirectMemorySize - BucketCacheSize - ReservoirSize). Now we are seeing netty's own OutOfDirectMemoryError, which still causes pain for clients but at least insulates the other components of the regionserver.
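For anyone wanting to reproduce the sizing, here is a minimal sketch of the subtraction described above. The class and method names are hypothetical, and the 8 GiB / 4 GiB / 1 GiB figures are placeholders rather than our actual settings. It assumes the netty property takes a plain byte count.

```java
// Hypothetical helper, not part of HBase: computes the netty direct memory
// cap as (MaxDirectMemorySize - BucketCacheSize - ReservoirSize).
final class NettyDirectMemoryBudget {
    static final long GIB = 1024L * 1024L * 1024L;

    // Property name as used in this issue (shaded netty in hbase-thirdparty).
    static final String NETTY_MAX_DIRECT_MEMORY_PROPERTY =
        "org.apache.hbase.thirdparty.io.netty.maxDirectMemory";

    // All arguments and the result are in bytes.
    static long nettyMaxDirectMemoryBytes(long maxDirectMemorySize,
                                          long bucketCacheSize,
                                          long reservoirSize) {
        long remaining = maxDirectMemorySize - bucketCacheSize - reservoirSize;
        if (remaining <= 0) {
            throw new IllegalArgumentException(
                "No direct memory left for netty: " + remaining + " bytes");
        }
        return remaining;
    }

    // Renders the JVM system property flag; assumes the value is a byte count.
    static String toJvmFlag(long bytes) {
        return "-D" + NETTY_MAX_DIRECT_MEMORY_PROPERTY + "=" + bytes;
    }

    public static void main(String[] args) {
        // Placeholder sizes for illustration only.
        long budget = nettyMaxDirectMemoryBytes(8 * GIB, 4 * GIB, 1 * GIB);
        System.out.println(toJvmFlag(budget));
    }
}
```

With a cap like this in place, netty should fail with its own OutOfDirectMemoryError before the JVM-wide direct memory limit is exhausted, which matches the insulation behavior described above.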
We're still digging into exactly why this is happening. The cluster clearly has a bad access pattern, but it doesn't seem like SSL should increase the memory footprint by 5-10x like we're seeing.
Attachments
Issue Links
- blocks: HBASE-28009 Document netty memory tunings (Open)
- Testing discovered: HBASE-28008 Add support for tcnative (Resolved)
- links to