Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
Description
[Reproduce steps]
- CREATE TABLE partitionthree1 (empno int, doj Timestamp, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int, utilization int,salary int, empname String, designation String) PARTITIONED BY (workgroupcategory int) STORED AS carbondata tblproperties('sort_scope'='local_sort', 'sort_columns'='deptname,empname');
- CREATE TABLE partitionthree2 (empno int, doj Timestamp, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int, utilization int,salary int, empname String, designation String) PARTITIONED BY (workgroupcategory int);
- LOAD DATA local inpath 'hdfs://hacluster/user/data.csv' INTO TABLE partitionthree1 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"', 'TIMESTAMPFORMAT'='dd-MM-yyyy');
- set hive.exec.dynamic.partition.mode=nonstrict;
- insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
insert into partitionthree2 select * from partitionthree1;
- insert into partitionthree1 select * from partitionthree2;
[Expected Result]
Step 6 (the final insert into partitionthree1) should launch only as many tasks as there are nodes.
[Current Behavior]
The number of tasks launched is far larger than the number of nodes.
[Impact]
At several product sites, query performance is significantly impacted.
[Initial analysis]
Inserting into a non-partitioned local_sort table launches a number of tasks equal to the number of nodes. Inserting into a partitioned local_sort table should behave the same way.
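For comparison, the non-partitioned case mentioned in the analysis could be checked with a sketch like the one below. The table name nonpartition1 is hypothetical and does not appear in the original report; the column list and table properties are copied from partitionthree1, with workgroupcategory moved to an ordinary trailing column so that it lines up with the output of select * on the partitioned source table.

```sql
-- Hypothetical comparison table (name assumed for illustration):
-- same schema and sort properties as partitionthree1, but not partitioned.
CREATE TABLE nonpartition1 (empno int, doj Timestamp, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp, attendance int, utilization int, salary int, empname String, designation String, workgroupcategory int) STORED AS carbondata tblproperties('sort_scope'='local_sort', 'sort_columns'='deptname,empname');

-- Per the initial analysis, this insert is expected to launch a number of
-- tasks equal to the number of nodes; compare its task count with the
-- final insert into partitionthree1 in the reproduce steps.
insert into nonpartition1 select * from partitionthree2;
```

Comparing the task counts of the two insert jobs (for example in the Spark UI) should show whether the partitioned path launches many more tasks than the non-partitioned one.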