  Hive / HIVE-21931

Slow compaction for tiny tables

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.1.0
    • None
    • None

    Description

      I observed this issue in an Impala development environment when running major compactions on insert_only transactional tables in Hive. A compaction could take ~10 minutes even when it only had to merge 2 rows from 2 inserts. The actual work finished much earlier: the new base file was correctly written to HDFS, after which Hive seemed to wait without doing any work.

      Compactions are started manually; hive.compactor.initiator.on=false is set to avoid "surprise compactions" during tests. The compactor settings in this environment are:

      hive.compactor.abortedtxn.threshold=1000
      hive.compactor.check.interval=300s
      hive.compactor.cleaner.run.interval=5000ms
      hive.compactor.compact.insert.only=true
      hive.compactor.crud.query.based=false
      hive.compactor.delta.num.threshold=10
      hive.compactor.delta.pct.threshold=0.1
      hive.compactor.history.reaper.interval=2m
      hive.compactor.history.retention.attempted=2
      hive.compactor.history.retention.failed=3
      hive.compactor.history.retention.succeeded=3
      hive.compactor.initiator.failed.compacts.threshold=2
      hive.compactor.initiator.on=false
      hive.compactor.max.num.delta=500
      hive.compactor.worker.threads=4
      hive.compactor.worker.timeout=86400s
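
      The time-valued settings above use mixed units (s, ms, m), which makes them awkward to compare against the observed ~10 minute delay. The following is a minimal sketch that normalizes them to seconds. The helper name parse_time_settings and the unit table are hypothetical, not part of Hive, and the sketch makes no claim about which setting (if any) causes the wait.

```python
import re

# Time-valued settings copied verbatim from the report above.
RAW_SETTINGS = """
hive.compactor.check.interval=300s
hive.compactor.cleaner.run.interval=5000ms
hive.compactor.history.reaper.interval=2m
hive.compactor.worker.timeout=86400s
"""

# Unit suffixes that appear in the report, expressed in milliseconds.
# (Assumption: only these three suffixes need to be handled here.)
UNIT_MILLIS = {"ms": 1, "s": 1000, "m": 60_000}


def parse_time_settings(text):
    """Return {setting name: duration in seconds} for time-valued lines."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        match = re.fullmatch(r"(\d+)(ms|s|m)", value.strip())
        if match:
            millis = int(match.group(1)) * UNIT_MILLIS[match.group(2)]
            result[key] = millis / 1000
    return result


time_settings = parse_time_settings(RAW_SETTINGS)

# Print the settings from shortest to longest duration.
for key, seconds in sorted(time_settings.items(), key=lambda kv: kv[1]):
    print(f"{key}: {seconds:g} s")
```

      Listing the durations side by side makes it easier to check whether the wait lines up with any configured interval when investigating further.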
      

            People

              Assignee: Unassigned
              Reporter: Csaba Ringhofer (csringhofer)
              Votes: 0
              Watchers: 6
