FileLinkInputStream is an InputStream that handles the indirection of where the real HFile lives. It is wrapped by FSDataInputStreamWrapper and is transparent to callers. Often, we have an FSDataInputStreamWrapper wrapping a FileLinkInputStream, which in turn wraps an FSDataInputStream.
The problem is that FileLinkInputStream does not implement the CanUnbuffer interface, which means that the underlying FSDataInputStream for the HFile the link refers to doesn't get unbuffer() called on it. This can cause an open Socket to hang around, as described in
Wellington Chevreuil and I have each run into this, for different users. Neither of us has reproduced it on our own; we think what these users have in common is that the problem only shows up when a very large snapshot is brought into a new system. Big kudos to Esteban Gutierrez for his help in diagnosing this as well!
If this analysis is accurate, it would affect all branches.
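To make the missing delegation concrete, here is a minimal, self-contained sketch of the pattern described above. It is not the actual HBase patch. `CanUnbuffer` is a local stand-in that mirrors the single `unbuffer()` method of Hadoop's `org.apache.hadoop.fs.CanUnbuffer`, so the example compiles without Hadoop on the classpath. `UnderlyingStream`, `LinkStream`, and the `resourcesHeld` flag are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Local stand-in for org.apache.hadoop.fs.CanUnbuffer (assumption: used here
// only so the sketch is self-contained).
interface CanUnbuffer {
  void unbuffer();
}

// Hypothetical stand-in for the stream opened on the real HFile.
// resourcesHeld models the buffers/socket that stay allocated after a read.
class UnderlyingStream extends ByteArrayInputStream implements CanUnbuffer {
  boolean resourcesHeld = false;

  UnderlyingStream(byte[] data) {
    super(data);
  }

  @Override
  public synchronized int read() {
    resourcesHeld = true; // a read leaves resources allocated
    return super.read();
  }

  @Override
  public void unbuffer() {
    resourcesHeld = false; // release resources but keep the stream usable
  }
}

// Hypothetical stand-in for the link stream. Because it implements
// CanUnbuffer, a caller's unbuffer() is forwarded to the wrapped stream
// instead of stopping at this layer.
class LinkStream extends InputStream implements CanUnbuffer {
  private final InputStream in;

  LinkStream(InputStream in) {
    this.in = in;
  }

  @Override
  public int read() throws IOException {
    return in.read();
  }

  @Override
  public void unbuffer() {
    // Forward only if the wrapped stream supports it; otherwise do nothing.
    if (in instanceof CanUnbuffer) {
      ((CanUnbuffer) in).unbuffer();
    }
  }

  @Override
  public void close() throws IOException {
    in.close();
  }
}
```

If `LinkStream` did not implement `CanUnbuffer`, a caller holding only the outer wrapper would have no way to reach `UnderlyingStream.unbuffer()`, which is the situation this issue describes for FileLinkInputStream.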
|Remove hadoop2.6.1-hadoop-2.6.4 as supported on branch-2.0||Resolved|
|Revert HBASE-21915 from branch-2.0||Resolved|