Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Won't Fix
Affects Version/s: 2.1.1
None
Environment: RHEL spark-2.4.5-bin-hadoop2.7 for carbon 2.1.1
Important
Description
Hi Team,
We are doing a POC with CarbonData using materialized views (MVs).
Our MV does not contain the AVG function, because we want to use incremental refresh.
But with incremental refresh, we noticed that the MV does not aggregate values correctly.
When a row is inserted into the raw table, a new row is created in the MV instead of the incremental value being added to the existing row.
As a result, the number of rows in the MV is almost the same as in the raw table.
This does not happen with a full-refresh MV.
Below is the data in the MV, with 3 rows:
scala> carbon.sql("select * from fact_365_1_eutrancell_21_30_minute").show()
--------------------------------------------------------------------------------------------------------------------------------------
fact_365_1_eutrancell_21_tags_id | fact_365_1_eutrancell_21_metric | ts | sum_value | min_value | max_value | fact_365_1_eutrancell_21_ts2 |
--------------------------------------------------------------------------------------------------------------------------------------
ff6cb0f7-fba0-413... | eUtranCell.HHO.X2... | 2020-09-25 06:30:00 | 5412.6810000000005 | 31.345 | 4578.112 | 2020-09-25 05:30:00 |
ff6cb0f7-fba0-413... | eUtranCell.HHO.X2... | 2020-09-25 05:30:00 | 1176.7035 | 392.2345 | 392.2345 | 2020-09-25 05:30:00 |
ff6cb0f7-fba0-413... | eUtranCell.HHO.X2... | 2020-09-25 06:00:00 | 58.112 | 58.112 | 58.112 | 2020-09-25 05:30:00 |
--------------------------------------------------------------------------------------------------------------------------------------
Below, I insert data for the 6th hour (ts 06:05:00); it should add its value incrementally to the 2020-09-25 06:00:00 row of the MV.
Note the data being inserted: the columns that are part of the GROUP BY clause have the same values as the existing data.
scala> carbon.sql("insert into fact_365_1_eutrancell_21 values ('2020-09-25 06:05:00','eUtranCell.HHO.X2.InterFreq.PrepAttOut','ff6cb0f7-fba0-4134-81ee-55e820574627',118.112,'2020-09-25 05:30:00')").show()
21/06/28 16:01:31 AUDIT audit: {"time":"June 28, 2021 4:01:31 PM IST","username":"root","opName":"INSERT INTO","opId":"7332282307468267","opStatus":"START"}
21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch one more time.
21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch one more time.
21/06/28 16:01:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch one more time.
21/06/28 16:01:33 AUDIT audit: {"time":"June 28, 2021 4:01:33 PM IST","username":"root","opName":"INSERT INTO","opId":"7332284066443156","opStatus":"START"}
[Stage 40:=====================================================>(199 + 1) / 200]21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row batch one more time.
21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row batch one more time.
21/06/28 16:01:44 WARN CarbonOutputIteratorWrapper: try to poll a row batch one more time.
21/06/28 16:01:44 AUDIT audit: {"time":"June 28, 2021 4:01:44 PM IST","username":"root","opName":"INSERT INTO","opId":"7332284066443156","opStatus":"SUCCESS","opTime":"11343 ms","table":"default.fact_365_1_eutrancell_21_30_minute","extraInfo":{}}
21/06/28 16:01:44 AUDIT audit: {"time":"June 28, 2021 4:01:44 PM IST","username":"root","opName":"INSERT INTO","opId":"7332282307468267","opStatus":"SUCCESS","opTime":"13137 ms","table":"default.fact_365_1_eutrancell_21","extraInfo":{}}
----------
Segment ID |
----------
8 |
----------
Below we can see that it has added another row for 2020-09-25 06:00:00.
Note: all columns that are part of the GROUP BY clause have the same values in both rows.
This means there should have been a single row for 2020-09-25 06:00:00.
scala> carbon.sql("select * from fact_365_1_eutrancell_21_30_minute").show(1000,false)
-------------------------------------------------------------------------------------------------------------------------------------------------
fact_365_1_eutrancell_21_tags_id | fact_365_1_eutrancell_21_metric | ts | sum_value | min_value | max_value | fact_365_1_eutrancell_21_ts2 |
-------------------------------------------------------------------------------------------------------------------------------------------------
ff6cb0f7-fba0-4134-81ee-55e820574627 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | 2020-09-25 06:30:00 | 5412.6810000000005 | 31.345 | 4578.112 | 2020-09-25 05:30:00 |
ff6cb0f7-fba0-4134-81ee-55e820574627 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | 2020-09-25 05:30:00 | 1176.7035 | 392.2345 | 392.2345 | 2020-09-25 05:30:00 |
ff6cb0f7-fba0-4134-81ee-55e820574627 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | 2020-09-25 06:00:00 | 58.112 | 58.112 | 58.112 | 2020-09-25 05:30:00 |
ff6cb0f7-fba0-4134-81ee-55e820574627 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | 2020-09-25 06:00:00 | 118.112 | 118.112 | 118.112 | 2020-09-25 05:30:00 |
-------------------------------------------------------------------------------------------------------------------------------------------------
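To make the expected behaviour concrete, here is a minimal sketch of how two partial aggregates with the same group-by key would be combined: sums are added, and min and max are taken across both rows. It is plain Scala with no CarbonData or Spark dependency. The names `MvMergeSketch`, `MvRow`, `merge` and `collapse` are hypothetical, and the key columns are an assumption read off the MV output above, since the MV definition is not shown in this report. This is not CarbonData's refresh logic.

```scala
// Hypothetical illustration only; not CarbonData's internal code.
object MvMergeSketch {
  // One MV row: the (assumed) group-by key columns plus the partial aggregates.
  final case class MvRow(tagsId: String, metric: String, ts: String, ts2: String,
                         sumValue: Double, minValue: Double, maxValue: Double) {
    def key: (String, String, String, String) = (tagsId, metric, ts, ts2)
  }

  // Combine two partial aggregates that share the same key.
  def merge(a: MvRow, b: MvRow): MvRow = {
    require(a.key == b.key, "rows must share the same group-by key")
    a.copy(
      sumValue = a.sumValue + b.sumValue,
      minValue = math.min(a.minValue, b.minValue),
      maxValue = math.max(a.maxValue, b.maxValue))
  }

  // Collapse a set of MV rows so that each key appears exactly once.
  def collapse(rows: Seq[MvRow]): Seq[MvRow] =
    rows.groupBy(_.key).values.map(_.reduce(merge)).toSeq

  val id = "ff6cb0f7-fba0-4134-81ee-55e820574627"
  val metric = "eUtranCell.HHO.X2.InterFreq.PrepAttOut"
  val ts2 = "2020-09-25 05:30:00"

  // The two 2020-09-25 06:00:00 rows from the MV output above.
  val existing = MvRow(id, metric, "2020-09-25 06:00:00", ts2, 58.112, 58.112, 58.112)
  val inserted = MvRow(id, metric, "2020-09-25 06:00:00", ts2, 118.112, 118.112, 118.112)

  val merged: MvRow = merge(existing, inserted)
}
```

Under these assumptions, `MvMergeSketch.merged` has a sum of about 176.224 (up to floating-point rounding), a min of 58.112 and a max of 118.112, and `collapse` returns a single row for the 06:00:00 key.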
scala> carbon.sql("select * from fact_365_1_eutrancell_21").show(1000,false)
----------------------------------------------------------------------------------------------------------------
ts | metric | tags_id | value | ts2 |
----------------------------------------------------------------------------------------------------------------
2020-09-25 05:30:00 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | ff6cb0f7-fba0-4134-81ee-55e820574627 | 392.2345 | 2020-09-25 05:30:00 |
2020-09-25 05:30:00 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | ff6cb0f7-fba0-4134-81ee-55e820574627 | 392.2345 | 2020-09-25 05:30:00 |
2020-09-25 05:30:00 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | ff6cb0f7-fba0-4134-81ee-55e820574627 | 392.2345 | 2020-09-25 05:30:00 |
2020-09-25 06:30:00 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | ff6cb0f7-fba0-4134-81ee-55e820574627 | 31.345 | 2020-09-25 05:30:00 |
2020-09-25 06:40:00 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | ff6cb0f7-fba0-4134-81ee-55e820574627 | 745.112 | 2020-09-25 05:30:00 |
2020-09-25 06:50:00 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | ff6cb0f7-fba0-4134-81ee-55e820574627 | 4578.112 | 2020-09-25 05:30:00 |
2020-09-25 06:55:00 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | ff6cb0f7-fba0-4134-81ee-55e820574627 | 58.112 | 2020-09-25 05:30:00 |
2020-09-25 06:25:00 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | ff6cb0f7-fba0-4134-81ee-55e820574627 | 58.112 | 2020-09-25 05:30:00 |
2020-09-25 06:05:00 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | ff6cb0f7-fba0-4134-81ee-55e820574627 | 118.112 | 2020-09-25 05:30:00 |
----------------------------------------------------------------------------------------------------------------
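As a cross-check, the nine raw rows above can be aggregated by hand. The sketch below assumes the MV groups by a 30-minute floor of `ts` (inferred from the MV name and its output; the MV definition is not shown in this report) and computes sum, min and max of `value` per bucket. `FullRefreshSketch` and `bucket30` are hypothetical names, and `tags_id`, `metric` and `ts2` are left out of the key because they are identical in every row.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Hypothetical illustration only; not CarbonData's internal code.
object FullRefreshSketch {
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Floor a timestamp to the start of its 30-minute bucket (assumed granularity).
  def bucket30(ts: String): String = {
    val t = LocalDateTime.parse(ts, fmt)
    t.withMinute(t.getMinute / 30 * 30).withSecond(0).withNano(0).format(fmt)
  }

  // (ts, value) pairs copied from the raw table output above.
  val raw: Seq[(String, Double)] = Seq(
    ("2020-09-25 05:30:00", 392.2345),
    ("2020-09-25 05:30:00", 392.2345),
    ("2020-09-25 05:30:00", 392.2345),
    ("2020-09-25 06:30:00", 31.345),
    ("2020-09-25 06:40:00", 745.112),
    ("2020-09-25 06:50:00", 4578.112),
    ("2020-09-25 06:55:00", 58.112),
    ("2020-09-25 06:25:00", 58.112),
    ("2020-09-25 06:05:00", 118.112))

  // bucket start -> (sum, min, max) of value
  val aggregated: Map[String, (Double, Double, Double)] =
    raw.groupBy { case (ts, _) => bucket30(ts) }
      .map { case (bucket, rows) =>
        val values = rows.map(_._2)
        (bucket, (values.sum, values.min, values.max))
      }
}
```

With this bucketing, `FullRefreshSketch.aggregated` has three entries, and the 06:00:00 bucket combines the 06:25:00 and 06:05:00 rows into one group (sum about 176.224, min 58.112, max 118.112) rather than two separate rows.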
After dropping and recreating the MV, we can see a single row for 2020-09-25 06:00:00.
scala> carbon.sql("select * from fact_365_1_eutrancell_21_30_minute").show(1000,false)
-------------------------------------------------------------------------------------------------------------------------------------------------
fact_365_1_eutrancell_21_tags_id | fact_365_1_eutrancell_21_metric | ts | sum_value | min_value | max_value | fact_365_1_eutrancell_21_ts2 |
-------------------------------------------------------------------------------------------------------------------------------------------------
ff6cb0f7-fba0-4134-81ee-55e820574627 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | 2020-09-25 06:30:00 | 5412.6810000000005 | 31.345 | 4578.112 | 2020-09-25 05:30:00 |
ff6cb0f7-fba0-4134-81ee-55e820574627 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | 2020-09-25 05:30:00 | 1176.7035 | 392.2345 | 392.2345 | 2020-09-25 05:30:00 |
ff6cb0f7-fba0-4134-81ee-55e820574627 | eUtranCell.HHO.X2.InterFreq.PrepAttOut | 2020-09-25 06:00:00 | 176.224 | 58.112 | 118.112 | 2020-09-25 05:30:00 |
-------------------------------------------------------------------------------------------------------------------------------------------------
Please check why the incremental-refresh MV creates a new row instead of updating the existing aggregate row.