Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-12046

Add profile counter for scan range queueing time on disk queues

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Open
    • Critical
    • Resolution: Unresolved
    • None
    • None
    • Backend
    • ghx-label-7

    Description

      I saw a profile showing the total time of a ScanNode is dominanted by ScannerIoWaitTime. However, the hdfs openFileTime and readTime are all small. No other counters can explain why ScannerIoWaitTime is long.

      - DecompressionTime: 964.648ms
      - InactiveTotalTime: 0.000ns
      - MaterializeTupleTime: 2s132ms
      - ScannerIoWaitTime: 11s641ms          <-- Dominants the total time
      - TotalRawHdfsOpenFileTime: 14.501ms
      - TotalRawHdfsReadTime: 1s374ms
      - TotalReadThroughput: 29.94 MB/secĀ 
      - TotalTime: 15s865ms

      After some debug, I realize the time is spent in queuing in the disk queue. If the scanner is consuming data faster than the disk queue threads can read, scan ranges will be queueing in the disk queues. The queueing time is not counted in either TotalRawHdfsOpenFileTime or TotalRawHdfsReadTime, but is counted in ScannerIoWaitTime. We should add profile counter for the queueing time on disk queues to better explain ScannerIoWaitTime.

      May also consider to add metrics for counting the size in the scanner which queues the data is reading from.

      Attachments

        Activity

          People

            baggio000 Yida Wu
            stigahuang Quanlong Huang
            Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated: