[HIVE-23703] Major QB compaction with multiple FileSinkOperators results in data loss and one original file - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 4.0.0-alpha-1
Component/s: None
Labels:
- compaction
- pull-request-available

Description

Problems

Example:

drop table if exists tbl2;
create transactional table tbl2 (a int, b int) clustered by (a) into 4 buckets stored as ORC TBLPROPERTIES('transactional'='true','transactional_properties'='default');
insert into tbl2 values(1,2),(1,3),(1,4),(2,2),(2,3),(2,4);
insert into tbl2 values(3,2),(3,3),(3,4),(4,2),(4,3),(4,4);
insert into tbl2 values(5,2),(5,3),(5,4),(6,2),(6,3),(6,4);

E.g. in the example above, bucketId=0 when a=2 and a=6.

1. Data loss
In non-acid tables, an operator's temp files are named with their task id. Because of this snippet, temp files in the FileSinkOperator for compaction tables are identified by their bucket_id.

if (conf.isCompactionTable()) {
 fsp.initializeBucketPaths(filesIdx, AcidUtils.BUCKET_PREFIX + String.format(AcidUtils.BUCKET_DIGITS, bucketId),
 isNativeTable(), isSkewedStoredAsSubDirectories);
 } else {
 fsp.initializeBucketPaths(filesIdx, taskId, isNativeTable(), isSkewedStoredAsSubDirectories);
 }

So 2 temp files containing data with a=2 and a=6 will be named bucket_0 and not 000000_0 and 000000_1 as they would normally.
In FileSinkOperator.commit, when data with a=2, filename: bucket_0 is moved from _task_tmp.-ext-10002 to _tmp.-ext-10002, it overwrites the files already there with a=6 data, because it too is named bucket_0. You can see in the logs:

 WARN [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Target path file:.../hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnNoBuckets-1591107230237/warehouse/testmajorcompaction/base_0000002_v0000013/.hive-staging_hive_2020-06-02_07-15-21_771_8551447285061957908-1/_tmp.-ext-10002/bucket_00000 with a size 610 exists. Trying to delete it.

2. Results in one original file
OrcFileMergeOperator merges the results of the FSOp into 1 file named 000000_0.

Fix

1. FSOp will store data as: taskid/bucketId. e.g. 0_0/bucket_0

2. OrcMergeFileOp, instead of merging a bunch of files into 1 file named 000000_0, will merge all files named bucket_0 into one file named bucket_0, and so on.

3. MoveTask will get rid of the taskId directories if present and only move the bucket files in them, in case OrcMergeFileOp is not run.

Attachments

Issue Links

duplicates

HIVE-22474 Query based major compaction always creates only one bucket file

Resolved

is related to

HIVE-24015 Disable query-based compaction on MR execution engine

Closed

links to

GitHub Pull Request #1134

Activity

People

Assignee:: Karen Coppage

Reporter:: Karen Coppage

Votes:: 0 Vote for this issue

Watchers:: 6 Start watching this issue

Dates

Created:: 16/Jun/20 14:58

Updated:: 17/Nov/22 08:48

Resolved:: 23/Jun/20 09:17

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

1h 20m