[HIVE-17923] 'cluster by' should not be needed for a bucketed table - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Duplicate
Affects Version/s: 3.0.0
Fix Version/s: None
Component/s: None
Labels: None
Target Version/s: 3.0.0

Description

Given:

CREATE TABLE over10k_orc_bucketed(t tinyint,
           si smallint,
           i int,
           b bigint,
           f float,
           d double,
           bo boolean,
           s string,
           ts timestamp,
           `dec` decimal(4,2),
           bin binary) CLUSTERED BY(si) INTO 4 BUCKETS STORED AS ORC;

insert into over10k_orc_bucketed select * from over10k

produces only 1 data file (bucket 0). Based on the input data, it should produce 4, one per bucket.

insert into over10k_orc_bucketed select * from over10k cluster by si

does the right thing and produces 4 data files.
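
One way to check how many data files an insert actually wrote is to group the table's rows by Hive's INPUT__FILE__NAME virtual column. The query below is only a sketch and not part of the original report: it assumes the over10k_orc_bucketed table defined above and that it is run immediately after one of the two inserts. The aliases data_file and row_count are illustrative names.

```sql
-- Count rows per underlying data file of the bucketed table.
-- INPUT__FILE__NAME is a Hive virtual column holding the path of the
-- file each row was read from. data_file and row_count are
-- illustrative aliases, not names from the original report.
SELECT INPUT__FILE__NAME AS data_file,
       count(*)          AS row_count
FROM over10k_orc_bucketed
GROUP BY INPUT__FILE__NAME;
```

Each result row corresponds to one data file, so the number of rows returned can be compared against the 4 buckets declared in the table definition.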

The full script is in acid_vectorization_original.q (HIVE-17458).

Issue Links

blocks: HIVE-17458 VectorizedOrcAcidRowBatchReader doesn't handle 'original' files (Closed)

is duplicated by: HIVE-18157 Vectorization : Insert in bucketed table is broken with vectorization (Closed)

People

Assignee: Deepak Jaiswal

Reporter: Eugene Koifman

Votes: 0

Watchers: 2

Dates

Created: 27/Oct/17 15:58

Updated: 07/Dec/17 20:26

Resolved: 07/Dec/17 20:11