[HIVE-11685] Restarting Metastore kills Compactions - store Hadoop job id in COMPACTION_QUEUE - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.0.1
Fix Version/s: None
Component/s: Metastore, Transactions
Labels: None

Description

CompactorMR submits an MR job to perform the compaction and waits for it to complete.
If the metastore needs to be restarted, the restart kills any in-flight compactions.

Ideally, we would add the Hadoop job ID to the COMPACTION_QUEUE table (and include it in SHOW COMPACTIONS), then poll for the job or register a callback, so that the job survives a Metastore restart.
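
A minimal sketch of the idea, in Python for brevity rather than as Hive code. It assumes a hypothetical `hadoop_job_id` column on a COMPACTION_QUEUE-like row and uses an in-memory `FakeCluster` in place of Hadoop; every class, function, and state string here is illustrative. The point it shows is that once the job ID is persisted at submission time, a restarted process can pick the job back up by polling instead of losing it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class QueueEntry:
    """One row of a COMPACTION_QUEUE-like table (columns are assumptions)."""
    cq_id: int
    state: str = "initiated"
    hadoop_job_id: Optional[str] = None  # hypothetical new column


class FakeCluster:
    """Stand-in for Hadoop: jobs outlive the process that submitted them."""

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}

    def submit(self) -> str:
        job_id = f"job_{len(self._jobs) + 1:04d}"
        self._jobs[job_id] = "RUNNING"
        return job_id

    def status(self, job_id: str) -> str:
        return self._jobs[job_id]

    def finish(self, job_id: str, ok: bool = True) -> None:
        self._jobs[job_id] = "SUCCEEDED" if ok else "FAILED"


def start_compaction(entry: QueueEntry, cluster: FakeCluster) -> None:
    """Submit the job and record its ID right away, before any waiting."""
    entry.hadoop_job_id = cluster.submit()
    entry.state = "working"


def poll_once(entry: QueueEntry, cluster: FakeCluster) -> str:
    """Run periodically, including by a freshly restarted process."""
    if entry.state != "working" or entry.hadoop_job_id is None:
        return entry.state
    status = cluster.status(entry.hadoop_job_id)
    if status == "SUCCEEDED":
        entry.state = "ready for cleaning"
    elif status == "FAILED":
        entry.state = "failed"
    return entry.state
```

In this model a restart loses nothing, because `poll_once` depends only on the stored row and the cluster, not on in-process state held by the submitter.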

Also, when running revokeTimedoutWorker(), make sure to use this job ID to kill the job if it is still running.
Alternatively, if it is still running, maybe just assign a new worker_id and let it continue to run.
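
The two options above can be sketched as a single decision function. This is again an illustrative Python model, not the actual revokeTimedoutWorker() implementation: the row fields, the `job_is_running` and `kill_job` callbacks, and the returned strings are all assumptions, and resetting the row to "initiated" is this sketch's guess at what revoking a worker means.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class QueueEntry:
    """Hypothetical queue row; hadoop_job_id is the proposed new column."""
    cq_id: int
    state: str
    worker_id: Optional[str]
    started_at: float
    hadoop_job_id: Optional[str]


def revoke_timed_out_worker(
    entry: QueueEntry,
    now: float,
    timeout: float,
    job_is_running: Callable[[str], bool],
    kill_job: Callable[[str], None],
    new_worker_id: Optional[str] = None,
) -> str:
    """Handle one entry whose worker may have exceeded the timeout."""
    if entry.state != "working" or now - entry.started_at <= timeout:
        return "untouched"

    running = entry.hadoop_job_id is not None and job_is_running(entry.hadoop_job_id)

    # Option 2: keep the job alive and hand it to a new worker.
    if running and new_worker_id is not None:
        entry.worker_id = new_worker_id
        entry.started_at = now
        return "reassigned"

    # Option 1: kill the orphaned job before putting the entry back in the queue.
    if running:
        kill_job(entry.hadoop_job_id)
    entry.state = "initiated"
    entry.worker_id = None
    entry.hadoop_job_id = None
    return "requeued"
```

Passing `new_worker_id` selects the "let it continue" behaviour. Omitting it selects the "kill and requeue" behaviour, so that a retry cannot run alongside a job that is still alive.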

Issue Links

depends upon

HIVE-12832 RDBMS schema changes for HIVE-11388

Closed

relates to

HIVE-11388 Allow ACID Compactor components to run in multiple metastores

Closed

HIVE-15337 Enhance Show Compactions output with JobId and start time for "attempted" state

Closed

Sub-Tasks

Thrift and schema changes for HIVE-11685

Resolved

Alan Gates

People

Assignee: Unassigned

Reporter: Eugene Koifman

Votes: 0

Watchers: 1

Dates

Created: 28/Aug/15 20:56

Updated: 28/Jul/21 14:13