IMPALA / IMPALA-12731

Investigate profile guided optimization of LLVM library

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 4.4.0
    • Fix Version/s: None
    • Component/s: Infrastructure
    • Labels: None

    Description

      LLVM compilation can be on the performance-critical path for query execution. Published benchmarks indicate that profile-guided optimization (PGO) can speed up a range of LLVM binaries, including Clang.

      https://llvm.org/docs/HowToBuildWithPGO.html has this sentence:

      PGO (Profile-Guided Optimization) allows your compiler to better optimize code for how it actually runs. Users report that applying this to Clang and LLVM can decrease overall compile time by 20%.

      https://github.com/llvm/llvm-project/issues/63486 gives some results from applying this to other LLVM binaries such as clangd, lldb, clang-tidy, etc.

      It sounds like this could speed up Impala's LLVM compilation. We could try this by building LLVM with profiling enabled, linking it into Impala, running small scale queries that do codegen (TPC-H, TPC-DS) to produce profiling data, then rebuilding LLVM with that profiling data.
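The experiment above could be scripted roughly as follows. This is a minimal sketch, not a tested recipe: it assumes LLVM is built with Clang and uses the `LLVM_BUILD_INSTRUMENTED` and `LLVM_PROFDATA_FILE` CMake options described in the LLVM PGO guide, plus `llvm-profdata merge`. The function names and directory names are hypothetical, and the middle step (linking the instrumented LLVM into Impala and running the TPC-H/TPC-DS queries) is left as a manual step because it depends on the Impala toolchain.

```python
from pathlib import Path


def configure_instrumented(llvm_src: Path, build_dir: Path) -> list[str]:
    """Phase 1: CMake configure command for an instrumented LLVM build."""
    return [
        "cmake", "-S", str(llvm_src), "-B", str(build_dir),
        "-DCMAKE_BUILD_TYPE=Release",
        # Assumption: Clang is the compiler; these CMake options emit Clang-style flags.
        "-DCMAKE_C_COMPILER=clang", "-DCMAKE_CXX_COMPILER=clang++",
        "-DLLVM_BUILD_INSTRUMENTED=IR",
    ]


# Phase 2 (manual, not scripted here): link the instrumented LLVM into Impala
# and run small codegen-heavy queries so that .profraw files are written.


def merge_profiles(profraw_files: list[Path], merged: Path) -> list[str]:
    """Phase 3: merge the raw profiles into a single .profdata file."""
    if not profraw_files:
        raise ValueError("no .profraw files found; run the training queries first")
    return ["llvm-profdata", "merge", f"-output={merged}",
            *[str(p) for p in profraw_files]]


def configure_optimized(llvm_src: Path, build_dir: Path, merged: Path) -> list[str]:
    """Phase 4: CMake configure command for the profile-optimized rebuild."""
    return [
        "cmake", "-S", str(llvm_src), "-B", str(build_dir),
        "-DCMAKE_BUILD_TYPE=Release",
        "-DCMAKE_C_COMPILER=clang", "-DCMAKE_CXX_COMPILER=clang++",
        f"-DLLVM_PROFDATA_FILE={merged}",
    ]


if __name__ == "__main__":
    src = Path("llvm-project/llvm")          # hypothetical checkout location
    merged = Path("impala-codegen.profdata")  # hypothetical file name
    # Print the configure commands for review instead of running them.
    print(" ".join(configure_instrumented(src, Path("build-instrumented"))))
    print(" ".join(configure_optimized(src, Path("build-pgo"), merged)))
```

The commands are returned as argument lists so they can be reviewed or passed to `subprocess.run` once the paths match the real toolchain layout.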

      These links refer to compilation with Clang, but we usually build LLVM with GCC. GCC also supports profile-guided optimization and should be able to achieve similar results; alternatively, we could try building the toolchain with Clang.
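For the GCC route, the two builds would differ mainly in compiler flags. The sketch below is an assumption about how that might look, not what the Impala toolchain does today: it uses GCC's `-fprofile-generate`, `-fprofile-use`, `-fprofile-dir`, `-fprofile-update=atomic`, and `-fprofile-correction` options and passes them through `CMAKE_C_FLAGS` and `CMAKE_CXX_FLAGS`. The helper names and the profile directory are hypothetical.

```python
from pathlib import Path


def gcc_pgo_flags(phase: str, profile_dir: Path) -> str:
    """Return GCC flags for the 'generate' or 'use' phase of a PGO build."""
    if phase == "generate":
        # Atomic counter updates, since Impala runs codegen from multiple threads.
        return (f"-fprofile-generate -fprofile-update=atomic "
                f"-fprofile-dir={profile_dir}")
    if phase == "use":
        # -fprofile-correction tolerates inconsistent counters from threaded runs.
        return f"-fprofile-use -fprofile-correction -fprofile-dir={profile_dir}"
    raise ValueError(f"unknown phase: {phase!r}")


def cmake_flag_args(phase: str, profile_dir: Path) -> list[str]:
    """Wrap the flags as CMake cache arguments for both C and C++."""
    flags = gcc_pgo_flags(phase, profile_dir)
    # Assumption: no other custom flags are needed; a real toolchain build
    # would have to append these to its existing flag set.
    return [f"-DCMAKE_C_FLAGS={flags}", f"-DCMAKE_CXX_FLAGS={flags}"]
```

Because LLVM is linked into Impala, the instrumented Impala link step would presumably also need `-fprofile-generate` so that the profiling runtime is pulled in. GCC names its profile data after the object file paths, so the second build likely needs the same build directory layout as the first.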

          People

            Assignee: Unassigned
            Reporter: Joe McDonnell (joemcdonnell)