  1. IMPALA
  2. IMPALA-11448

Always assign Ozone I/O to remote thread group to improve performance


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • Impala 4.0.0

    Description

      IMPALA-9400 added initial support for Ozone (o3fs/ofs) by treating all Ozone I/O as remote, which is a valid assumption.

      However, Impala's internal logic assigns this I/O to a single local disk I/O thread, severely limiting I/O parallelism. This is evident when running a debug build, which fails the following check:

      Log file created at: 2022/07/18 18:15:02
      Running on machine: rhel05.ozone.cisco.local
      Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
      F0718 18:15:02.269232 105827 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000030] Check failed: !IsOzonePath(file)
      F0718 18:15:02.269235 105832 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000003] Check failed: !IsOzonePath(file)
      F0718 18:15:02.269273 105834 disk-io-mgr.cc:561] 004a5f3dd4a34435:f2e0cb9b00000014] Check failed: !IsOzonePath(file)
      

      The is_remote parameter of a scan range is always false for Ozone:

      TScanRangeParams {
        01: scan_range (struct) = TScanRange {
          01: hdfs_file_split (struct) = THdfsFileSplit {
            01: relative_path (string) = "base_1/2b4595d335caddf5-4c9efd320000000e_1114488364_data.0.parq",
            02: offset (i64) = 0,
            03: length (i64) = 15982993,
            04: partition_id (i64) = 172,
            05: file_length (i64) = 15982993,
            06: file_compression (i32) = 0,
            07: mtime (i64) = 1657968669139,
            08: is_erasure_coded (bool) = false,
            09: partition_path_hash (i32) = -343315716,
          },
        },
        02: volume_id (i32) = 65535,
        03: try_hdfs_cache (bool) = false,
        04: is_remote (bool) = false,
      

      Because Ozone does not yet have short-circuit read support, I think a quick fix is to always force Ozone I/O onto the remote I/O thread group assigned to it.
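
      A minimal sketch of the proposed routing, not the actual disk-io-mgr.cc code. The QueueLayout struct, the AssignQueue signature, and the queue numbering are hypothetical and chosen only for illustration, and the IsOzonePath below is a simplified stand-in that matches the o3fs:// and ofs:// schemes. The point is that an Ozone path is sent to its remote queue before is_remote or volume_id is consulted, so a false is_remote can no longer push it onto a local disk thread:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical layout: queues [0, num_local_disks) serve local disks,
// followed by one queue per remote filesystem type.
struct QueueLayout {
  int num_local_disks;
  int RemoteDfsQueue() const { return num_local_disks; }
  int RemoteOzoneQueue() const { return num_local_disks + 1; }
};

// Simplified stand-in for a path check: true for o3fs:// and ofs:// URIs.
bool IsOzonePath(const std::string& path) {
  return path.rfind("o3fs://", 0) == 0 || path.rfind("ofs://", 0) == 0;
}

// Picks the I/O queue for a scan range (illustrative signature).
int AssignQueue(const QueueLayout& layout, const std::string& path,
                int volume_id, bool is_remote) {
  assert(layout.num_local_disks > 0);
  // Ozone has no short-circuit reads, so ignore is_remote and always use
  // the remote Ozone queue.
  if (IsOzonePath(path)) return layout.RemoteOzoneQueue();
  // Other remote reads go to the generic remote queue.
  if (is_remote) return layout.RemoteDfsQueue();
  // Local read with a known volume: use that disk's queue.
  if (volume_id >= 0 && volume_id < layout.num_local_disks) return volume_id;
  // Unknown volume: spread local reads across disk queues by path hash.
  return static_cast<int>(std::hash<std::string>{}(path) %
                          layout.num_local_disks);
}
```

      With four local disks in this sketch, a scan range on an ofs:// path with volume_id 65535 and is_remote false, like the one in the dump above, lands on the remote Ozone queue instead of a local disk queue.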

      Attachments

        1. Screen Shot 2022-07-20 at 10.35.24 AM.png
          343 kB
          Wei-Chiu Chuang
        2. Screen Shot 2022-07-20 at 11.21.52 AM.png
          345 kB
          Wei-Chiu Chuang

            People

              weichiu Wei-Chiu Chuang
              weichiu Wei-Chiu Chuang