Apache Hudi / HUDI-1508

Partition update with global index in MOR tables resulting in duplicate values during read optimized queries


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • index

    Description

      With a global index, Hudi handles a partition path update by locating the existing record, deleting it from the previous partition, and inserting it into the new partition. In Merge-on-Read tables, the delete operation, like any update operation, is written to a log file. However, because the insert occurs in the new partition, the new record is written to a parquet file. A query using `QUERY_TYPE_READ_OPTIMIZED_OPT_VAL` reads only parquet files, so it never sees the delete in the log file. As a result, the query returns two records for the same primary key: the stale record in the previous partition and the new record in the new partition.
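
      The behavior described above can be illustrated with a small toy model. This is only a sketch in plain Python, not Hudi code: the class `ToyMorTable`, its methods, and the sample key and partition values are all hypothetical, and compaction and moving a record back to an earlier partition are deliberately not modeled.

```python
from collections import defaultdict


class ToyMorTable:
    """Toy stand-in for a Merge-on-Read table with a global index (not Hudi code)."""

    def __init__(self):
        self.base = defaultdict(dict)  # partition -> {key: value}, models parquet base files
        self.log = defaultdict(list)   # partition -> [(key, value or None)], models log files
        self.global_index = {}         # key -> partition currently holding the record

    def upsert(self, key, partition, value):
        old_partition = self.global_index.get(key)
        if old_partition is None:
            # Brand-new key: the insert lands in a base (parquet) file.
            self.base[partition][key] = value
        elif old_partition == partition:
            # Update within the same partition: appended to a log file.
            self.log[partition].append((key, value))
        else:
            # Partition path changed: the delete goes to the old partition's log,
            # while the insert goes to a base file in the new partition.
            self.log[old_partition].append((key, None))
            self.base[partition][key] = value
        self.global_index[key] = partition

    def read_optimized(self):
        # Reads base files only and ignores log files.
        return sorted((p, k, v) for p, rows in self.base.items() for k, v in rows.items())

    def snapshot(self):
        # Merges log entries on top of the base files before returning rows.
        result = []
        for p, rows in self.base.items():
            merged = dict(rows)
            for key, value in self.log.get(p, []):
                if value is None:
                    merged.pop(key, None)  # apply the logged delete
                else:
                    merged[key] = value    # apply the logged update
            result.extend((p, k, v) for k, v in merged.items())
        return sorted(result)


table = ToyMorTable()
table.upsert("id-1", "2020/01/01", "v1")  # initial insert
table.upsert("id-1", "2020/01/02", "v2")  # same key, new partition path

ro_rows = table.read_optimized()
snapshot_rows = table.snapshot()
print("read optimized:", ro_rows)
print("snapshot:", snapshot_rows)
```

      In this model the read optimized view returns the key twice, once per partition, while the snapshot view returns only the record in the new partition, because only the snapshot view applies the logged delete.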


            People

              xushiyan Shiyan Xu
              ryanpife Ryan Pifer