IMPALA-3528

Memory of scratch batch should be transferred when closing a Parquet scanner thread.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: Impala 2.6.0
    • Component/s: Backend
    • Labels: None

    Description

      The lifetime of a scanner thread is decoupled from that of row batches that it produces. That means that all resources associated with row batches produced by the scanner thread should be transferred to those batches.

      The bug is that we are not transferring the ownership of memory from the scratch batch to the final row batch returned in HdfsParquetScanner::Close().

      Relevant snippet:

      void HdfsParquetScanner::Close() {
        ...
        if (batch_ != NULL) {
          AttachPool(dictionary_pool_.get(), false);
          AddFinalRowBatch();
        }
        // Verify all resources (if any) have been transferred.
        DCHECK_EQ(dictionary_pool_.get()->total_allocated_bytes(), 0);
        DCHECK_EQ(scratch_batch_->mem_pool()->total_allocated_bytes(), 0);
        DCHECK_EQ(context_->num_completed_io_buffers(), 0);
        ...
      }
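
To make the missing transfer concrete, here is a minimal, self-contained sketch of the ownership pattern described above. It is a toy model, not Impala code: ToyPool, ToyRowBatch, ToyScanner and DemoClose are hypothetical names invented for illustration, and the pool only counts bytes. The sketch assumes the fix is to hand the scratch batch's memory to the final row batch in the same way the snippet already does for the dictionary pool; the actual change in Impala may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory pool; this is NOT Impala's MemPool.
class ToyPool {
 public:
  // Allocates one chunk of `bytes` bytes owned by this pool.
  void Allocate(std::size_t bytes) {
    chunks_.emplace_back(bytes);
    total_allocated_bytes_ += bytes;
  }

  // Moves every chunk from `src` into this pool, leaving `src` empty.
  void AcquireData(ToyPool* src) {
    for (auto& chunk : src->chunks_) chunks_.push_back(std::move(chunk));
    total_allocated_bytes_ += src->total_allocated_bytes_;
    src->chunks_.clear();
    src->total_allocated_bytes_ = 0;
  }

  std::size_t total_allocated_bytes() const { return total_allocated_bytes_; }

 private:
  std::vector<std::vector<char>> chunks_;
  std::size_t total_allocated_bytes_ = 0;
};

// Hypothetical row batch that owns the memory its rows point into.
struct ToyRowBatch {
  ToyPool pool;
};

// Hypothetical scanner whose lifetime is shorter than that of its batches.
class ToyScanner {
 public:
  ToyScanner() : batch_(new ToyRowBatch()) {
    dictionary_pool_.Allocate(64);  // e.g. decoded dictionary values
    scratch_pool_.Allocate(128);    // e.g. tuple memory of the scratch batch
  }

  // Hands all scanner-owned memory to the final batch and returns that batch.
  std::unique_ptr<ToyRowBatch> Close() {
    if (batch_ != nullptr) {
      batch_->pool.AcquireData(&dictionary_pool_);
      // The transfer the report says is missing (assumed shape of the fix).
      batch_->pool.AcquireData(&scratch_pool_);
    }
    // Verify all resources (if any) have been transferred.
    assert(dictionary_pool_.total_allocated_bytes() == 0);
    assert(scratch_pool_.total_allocated_bytes() == 0);
    return std::move(batch_);
  }

 private:
  std::unique_ptr<ToyRowBatch> batch_;
  ToyPool dictionary_pool_;
  ToyPool scratch_pool_;
};

// Usage: closes a scanner and reports how many bytes the final batch now owns.
std::size_t DemoClose() {
  ToyScanner scanner;
  std::unique_ptr<ToyRowBatch> final_batch = scanner.Close();
  return final_batch->pool.total_allocated_bytes();
}
```

In this toy model, removing the second AcquireData call makes the scratch-pool check in Close() fail in a debug build, which mirrors the situation the DCHECKs in the snippet above are meant to catch.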

      I noticed this bug while investigating IMPALA-3519. Unfortunately, Tim and I could not find a direct connection to IMPALA-3519, so this is probably a separate problem.

      As far as I know, we have not seen any problems or crashes caused by this bug, but it is definitely a bug.

            People

              Assignee: Alexander Behm (alex.behm)
              Reporter: Alexander Behm (alex.behm)
