Description
What happened:
When io.file.buffer.size is set to a very large value, BufferedIOStatisticsOutputStream in Hadoop Common throws java.lang.OutOfMemoryError, because the value is used as the buffer size without being checked or handled.
The value of this config is passed to the file system as the bufferSize parameter when an output stream is created.
Buggy code:
In RawLocalFileSystem.java:
  private FSDataOutputStream create(Path f, boolean overwrite,
      boolean createParent, int bufferSize, short replication, long blockSize,
      Progressable progress, FsPermission permission) throws IOException {
    ...
    return new FSDataOutputStream(new BufferedIOStatisticsOutputStream(
        createOutputStreamWithMode(f, false, permission),
        bufferSize, true),  <<--- creates a BufferedIOStatisticsOutputStream with bufferSize, often set to config io.file.buffer.size
        statistics);
  }
In BufferedIOStatisticsOutputStream.java:
  public class BufferedIOStatisticsOutputStream
      extends BufferedOutputStream
      implements IOStatisticsSource, Syncable, StreamCapabilities {
    ...
    public BufferedIOStatisticsOutputStream(
        final OutputStream out,
        final int size,
        final boolean downgradeSyncable) {
      super(out, size);  <<--- init the BufferedOutputStream with a huge buffer size
      ...
    }
StackTrace:
java.lang.OutOfMemoryError: Java heap space
    at java.base/java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:75)
    at org.apache.hadoop.fs.statistics.BufferedIOStatisticsOutputStream.<init>(BufferedIOStatisticsOutputStream.java:78)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:428)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:413)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1175)
    at org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset(ContractTestUtils.java:183)
    at org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset(ContractTestUtils.java:152)
    at org.apache.hadoop.fs.contract.AbstractContractRenameTest.expectRenameUnderFileFails(AbstractContractRenameTest.java:335)
    ...
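Possible handling (sketch only):
The stack trace shows the allocation failing inside the BufferedOutputStream constructor, so one way to avoid the OutOfMemoryError would be to validate the size before it reaches that constructor. The snippet below is a minimal, JDK-only sketch of that idea and is not the Hadoop code or a proposed patch: the class BufferSizeGuard, the helper checkBufferSize, and the 64 MiB limit MAX_BUFFER_SIZE are all hypothetical names and values chosen for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

class BufferSizeGuard {
    // Hypothetical upper bound (64 MiB), chosen only for illustration.
    static final int MAX_BUFFER_SIZE = 64 * 1024 * 1024;

    // Rejects non-positive or oversized values instead of letting the
    // stream constructor attempt a huge array allocation.
    static int checkBufferSize(int requested) {
        if (requested <= 0 || requested > MAX_BUFFER_SIZE) {
            throw new IllegalArgumentException(
                "Invalid buffer size " + requested
                    + "; expected a value in [1, " + MAX_BUFFER_SIZE + "]");
        }
        return requested;
    }

    // Wraps a stream only after the requested size has been validated.
    static OutputStream newBufferedStream(OutputStream out, int bufferSize) {
        return new BufferedOutputStream(out, checkBufferSize(bufferSize));
    }

    public static void main(String[] args) throws Exception {
        // A moderate size is accepted and the stream works as usual.
        try (OutputStream ok = newBufferedStream(new ByteArrayOutputStream(), 4096)) {
            ok.write(42);
        }
        // The value from the reproduction steps is rejected up front.
        try {
            newBufferedStream(new ByteArrayOutputStream(), 2112001717);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With a check of this kind, a misconfigured value fails with a descriptive IllegalArgumentException rather than exhausting the heap. Whether to reject the value or fall back to a default is a design choice left open here.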
Reproduce:
(1) Set io.file.buffer.size to a large value, e.g., 2112001717
(2) Run a simple test that exercises this parameter, e.g. org.apache.hadoop.fs.contract.rawlocal.TestRawlocalContractRename#testRenameFileUnderFile