  1. Hive
  2. HIVE-24854

Incremental Materialized view refresh in presence of update/delete operations

Details

    Description

      The current implementation of incremental materialized view rebuild cannot be used if any of the materialized view's source tables has had an update or delete operation since the last rebuild. In such cases a full rebuild should be performed.

      Steps to enable incremental rebuild:
      1. Introduce a new virtual column that marks a row as deleted.
      2. Execute the query in the view definition.
      2.a Add a filter to each table scan so that it pulls from each source table only the rows with a writeId higher than the writeId of the last rebuild. This is already implemented by the current incremental rebuild.
      2.b Add the "row is deleted" virtual column to each table scan. In join nodes, if the row coming from any of the branches is deleted, the result row is also deleted.
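
      Steps 2.a and 2.b can be illustrated with a small model. This is only a sketch, not Hive code: the names ScannedRow, scan_since, join_with_deleted_flag and the is_deleted field are hypothetical stand-ins for a table scan, the writeId filter, a join node and the proposed virtual column. It also ignores how the planner combines changed rows with unchanged rows of the other source tables.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ScannedRow:
    key: int
    value: str
    write_id: int     # write ID of the transaction that produced this row version
    is_deleted: bool  # hypothetical stand-in for the "row is deleted" virtual column


def scan_since(rows: List[ScannedRow], last_rebuild_write_id: int) -> List[ScannedRow]:
    """Step 2.a: keep only rows written after the last rebuild."""
    return [r for r in rows if r.write_id > last_rebuild_write_id]


def join_with_deleted_flag(
    left: List[ScannedRow], right: List[ScannedRow]
) -> List[Tuple[int, str, str, bool]]:
    """Step 2.b: inner equi-join on key; the result row is deleted
    if the row from either branch is deleted."""
    out = []
    for l in left:
        for r in right:
            if l.key == r.key:
                out.append((l.key, l.value, r.value, l.is_deleted or r.is_deleted))
    return out


# Assumed example data: the last rebuild happened at write ID 10.
orders = [
    ScannedRow(1, "o1", 5, False),   # older than the last rebuild, filtered out
    ScannedRow(2, "o2", 12, False),  # inserted after the last rebuild
    ScannedRow(3, "o3", 13, True),   # deleted after the last rebuild
]
customers = [
    ScannedRow(1, "c1", 3, False),
    ScannedRow(2, "c2", 11, False),
    ScannedRow(3, "c3", 14, False),
]

joined = join_with_deleted_flag(scan_since(orders, 10), scan_since(customers, 10))
print(joined)
```

      In this model the joined row for key 3 carries the deleted flag because its orders side was deleted, even though its customers side was not.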

      We should distinguish two types of view definition queries: with and without an Aggregate.

      3.a No aggregate path:
      Rewrite the plan of the full rebuild into a multi-insert statement with two insert branches: one branch inserts new rows into the materialized view table, and the other inserts deleted rows into the materialized view's delete delta.
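
      The two insert branches can be pictured as a split on the deleted flag. The sketch below is a simplified model under assumptions, not the actual plan rewrite: split_insert_branches and the tuple layout are hypothetical, and the delete delta is modeled as a plain list of rows rather than Hive's on-disk format.

```python
from typing import List, Tuple

FlaggedRow = Tuple[int, str, bool]  # (key, value, is_deleted), an assumed layout
PlainRow = Tuple[int, str]


def split_insert_branches(
    rows: List[FlaggedRow],
) -> Tuple[List[PlainRow], List[PlainRow]]:
    """Route each row of the rebuild query output to one of two branches:
    live rows go to the materialized view table, deleted rows go to the
    delete delta."""
    new_rows: List[PlainRow] = []
    delete_delta: List[PlainRow] = []
    for key, value, is_deleted in rows:
        if is_deleted:
            delete_delta.append((key, value))
        else:
            new_rows.append((key, value))
    return new_rows, delete_delta


# Assumed output of the view definition query with the deleted flag attached.
rebuild_output = [
    (2, "o2-c2", False),
    (3, "o3-c3", True),
    (4, "o4-c4", False),
]

new_rows, delete_delta = split_insert_branches(rebuild_output)
print("insert into MV table:", new_rows)
print("insert into MV delete delta:", delete_delta)
```

      Both branches read the same query output once, which is the point of using a multi-insert statement instead of two separate statements.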

      3.b Aggregate path: TBD

      Prerequisite:
      The source tables have not been compacted since the last MV rebuild.

            People

              kkasa Krisztian Kasa
              kkasa Krisztian Kasa
              Votes: 0
              Watchers: 1

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 20m