Apache Hudi / HUDI-6339

Ability to Disable Partition Deletes during Clean


Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • None
    • None
    • cleaning, table-service

    Description

      We recently experienced a large data loss in one of our largest Hudi tables. Entire partitions in the table were being deleted, and at first we did not know why. After a deep analysis of the code, we traced the deletes to the Cleaning service, specifically the logic that decides whether a given partition is empty. We are running Hudi 0.12.3, so the link below points to that release:

      https://github.com/apache/hudi/blob/release-0.12.3/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java#L370

       

      The root cause of our issue is that we use the Metadata Table (MDT) and it became inconsistent with the underlying filesystem (we do not yet know how). We had no auditing in place to alert us to MDT inconsistencies, so the MDT remained in this state for a considerable amount of time.
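      The kind of audit we were missing could look roughly like the sketch below. It is not Hudi code: the class and method names are hypothetical, and it assumes the two partition listings (one taken directly from the filesystem, one from the MDT) have already been collected into plain sets of partition paths by whatever means the deployment has available.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical audit helper for illustration only; not part of Hudi. */
final class MdtPartitionAudit {

    /**
     * Returns the partitions that exist on disk but are absent from the
     * metadata table listing, sorted for stable reporting.
     */
    static Set<String> missingFromMetadata(Set<String> onDisk, Set<String> inMetadata) {
        Set<String> missing = new TreeSet<>(onDisk);
        missing.removeAll(inMetadata);
        return missing;
    }

    public static void main(String[] args) {
        // Assumed inputs: in practice these would come from a direct
        // filesystem listing and from an MDT-backed listing respectively.
        Set<String> onDisk = Set.of("2023/01/01", "2023/01/02", "2023/01/03");
        Set<String> inMetadata = Set.of("2023/01/01");

        Set<String> missing = missingFromMetadata(onDisk, inMetadata);
        if (!missing.isEmpty()) {
            // A real audit would raise an alert here instead of printing.
            System.out.println("MDT is missing partitions: " + missing);
        }
    }
}
```

      A non-empty result means the MDT can no longer be trusted for that table, and a full clean should not run until the two listings agree.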

      Because of the inconsistencies, many partitions existed on disk but not in the MDT. A full, non-incremental clean was then run on the table, which caused the Cleaner to scan all partitions in the table and compare what was on disk with what was in the MDT. The Cleaner mistakenly considered all of the partitions that were on disk but missing from the MDT to be empty (even though they were not) and proceeded to recursively delete them.

      Due to the high-risk nature of partition deletes, I propose a configuration that lets Hudi operators disable partition deletes on critical tables where deleting entire partitions is never desired. This is the case for all of our time-series Hudi tables.
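      To make the proposal concrete, here is a minimal sketch of how such a switch might gate the decision. Everything in it is an assumption for illustration: the config key `hoodie.clean.allow.partition.deletes` does not exist in Hudi today, and the class, method, and the file-count map stand in for the real planner logic rather than reproducing it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical guard for illustration only; not the actual CleanPlanner logic. */
final class PartitionDeleteGuard {

    // Hypothetical config key; defaults to "true" so current behavior is unchanged.
    static final String ALLOW_PARTITION_DELETES = "hoodie.clean.allow.partition.deletes";

    /**
     * Returns the partitions that would be deleted as "empty".
     * When the hypothetical flag is false, no partition is ever returned,
     * regardless of what the (possibly stale) file counts claim.
     */
    static List<String> partitionsToDelete(Map<String, Integer> fileCountByPartition,
                                           Map<String, String> props) {
        boolean allowed = Boolean.parseBoolean(
                props.getOrDefault(ALLOW_PARTITION_DELETES, "true"));
        List<String> result = new ArrayList<>();
        if (!allowed) {
            return result; // operator opted out: never delete whole partitions
        }
        for (Map.Entry<String, Integer> entry : fileCountByPartition.entrySet()) {
            if (entry.getValue() == 0) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result); // deterministic order for logging and review
        return result;
    }
}
```

      With the flag set to false, a stale listing that reports zero files for a live partition can no longer lead to a recursive delete; the worst outcome is that a genuinely empty directory is left behind.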

       

      NOTE: I see that the master branch (not yet released) contains some improvements to the logic that determines whether a partition is empty. These improvements are great, but due to the high-risk nature of partition deletes, I still propose adding a configuration so that users can fully disable partition deletes on tables that should never experience them.

      Recent changes: https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java#L392

          People

            dave_hagman Dave Hagman