Hadoop Common / HADOOP-18621

CryptoOutputStream::close leak when encrypted zones + quota exceptions


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.3.1, 3.3.2, 3.3.5, 3.3.3, 3.3.4
    • Fix Version/s: 3.4.0, 3.3.5
    • Component/s: fs
    • Hadoop Flags: Reviewed
    • Labels: Patch, Important

    Description

      I would like to report a resource leak (of DFSOutputStream objects) when using the (Java) hadoop-hdfs-client.

      Specifically, at least in my case, it occurs when there is a combination of:

      • encrypted zones
      • space quota exceptions (DSQuotaExceededException)

      As you know, when encrypted zones are in play, calling fs.create(path) in the hadoop-hdfs-client returns an HdfsDataOutputStream object which wraps a CryptoOutputStream, which in turn wraps a DFSOutputStream.
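
      For illustration, a minimal sketch of how that layering can be observed from client code via FSDataOutputStream.getWrappedStream(); the path /ez/demo.txt and the assumption that /ez is an encryption zone are just example values:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataOutputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class WrappingDemo {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // fs.defaultFS assumed to point at an HDFS cluster
            FileSystem fs = FileSystem.get(conf);

            // Inside an encryption zone, create() hands back an HdfsDataOutputStream
            // wrapping a CryptoOutputStream, which in turn wraps a DFSOutputStream.
            FSDataOutputStream out = fs.create(new Path("/ez/demo.txt"));
            try {
              System.out.println("outer: " + out.getClass().getName());
              System.out.println("inner: " + out.getWrappedStream().getClass().getName());
              out.writeBytes("hello");
            } finally {
              // With a small space quota on /ez this close() can throw DSQuotaExceededException.
              out.close();
            }
          }
        }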

      Even though my code correctly calls stream.close() on the above, I can see from debugging that the underlying DFSOutputStream objects are being leaked.

      Specifically, I see the DFSOutputStream objects being leaked via the filesBeingWritten map in DFSClient, i.e. the DFSOutputStream objects remain in the map even though I've called close() on the stream object.

      I suspect this is due to a bug in CryptoOutputStream::close:

        @Override
        public synchronized void close() throws IOException {
          if (closed) {
            return;
          }
          try {
            flush();
            if (closeOutputStream) {
              super.close();
              codec.close();
            }
            freeBuffers();
          } finally {
            closed = true;
          }
        }

      ... whereby if flush() throws (observed in my case when a DSQuotaExceededException is raised because the space quota is exceeded), then the super.close() on the underlying DFSOutputStream is skipped.

      In my case I had a space quota set up on a directory which is also in an encrypted zone, so each attempt to create and write to a file failed and leaked a DFSOutputStream as above.
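
      For reference, a rough sketch of that kind of setup (the directory /ez, the key name "testkey" and the 1 KB quota are example values; the key must already exist in the configured KMS):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.hdfs.DistributedFileSystem;
        import org.apache.hadoop.hdfs.client.HdfsAdmin;
        import org.apache.hadoop.hdfs.protocol.HdfsConstants;

        public class QuotaEzSetup {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // fs.defaultFS assumed to point at an HDFS cluster
            DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
            HdfsAdmin admin = new HdfsAdmin(FileSystem.getDefaultUri(conf), conf);

            Path dir = new Path("/ez");                 // example directory
            dfs.mkdirs(dir);

            // Turn the directory into an encryption zone ("testkey" must already exist in the KMS).
            admin.createEncryptionZone(dir, "testkey");

            // Set a tiny space quota so writes into the zone fail with DSQuotaExceededException.
            dfs.setQuota(dir, HdfsConstants.QUOTA_DONT_SET, 1024L);
          }
        }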

      I have attached a speculative patch (hadoop_cryto_stream_close_try_finally.diff) which simply wraps the flush() call in a try ... finally. The patch resolves the problem in my testing.
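
      The shape of the change is roughly the following (a sketch of the idea rather than the exact attached diff):

        @Override
        public synchronized void close() throws IOException {
          if (closed) {
            return;
          }
          try {
            try {
              flush();
            } finally {
              // Run the cleanup even if flush() throws (e.g. DSQuotaExceededException),
              // so the wrapped DFSOutputStream is closed and removed from
              // DFSClient#filesBeingWritten.
              if (closeOutputStream) {
                super.close();
                codec.close();
              }
              freeBuffers();
            }
          } finally {
            closed = true;
          }
        }

      One trade-off of this shape is that if both flush() and the inner close() throw, the exception from flush() is replaced by the one from the finally block; the important part is that the underlying stream is always released.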

      Thanks.

    Attachments

      hadoop_cryto_stream_close_try_finally.diff


    People

      Assignee: Colm Dougan (cdougan)
      Reporter: Colm Dougan (cdougan)
