Apache Hudi / HUDI-242 [RFC-12] Support Efficient bootstrap of large parquet datasets to Hudi / HUDI-417

Refactor HoodieWriteClient so that commit logic can be shared by both bootstrap and normal write operations


Details

    Description

       

      The basic code changes are in the fork: https://github.com/bvaradar/hudi/tree/vb_bootstrap

       

      The current implementation of HoodieBootstrapClient duplicates the code that commits a bootstrap:

      https://github.com/bvaradar/hudi/blob/vb_bootstrap/hudi-client/src/main/java/org/apache/hudi/bootstrap/HoodieBootstrapClient.java 

       

      An independent PR can move this commit functionality from HoodieWriteClient into a new base class, AbstractHoodieWriteClient, which HoodieBootstrapClient can then extend.
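A minimal sketch of the proposed shape, under stated assumptions: only the three class names come from this issue. The method names (`commit`, `actionType`, `upsert`, `bootstrap`, `completedCommits`), their signatures, and the in-memory list standing in for the timeline are hypothetical and chosen for illustration; the real Hudi classes are far larger and may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for the shared base class proposed in this issue.
abstract class AbstractHoodieWriteClient {
    // Illustrative in-memory record of completed commits (not Hudi's real timeline).
    private final List<String> completedCommits = new ArrayList<>();

    // Shared commit logic: written once here and inherited by every client.
    protected final boolean commit(String instantTime, List<String> writeStatuses) {
        if (writeStatuses.isEmpty()) {
            return false; // nothing was written, so nothing is committed
        }
        completedCommits.add(actionType() + ":" + instantTime);
        return true;
    }

    // Each concrete client names the kind of action it commits.
    protected abstract String actionType();

    public List<String> completedCommits() {
        return Collections.unmodifiableList(completedCommits);
    }
}

// Normal write path: reuses the inherited commit logic.
class HoodieWriteClient extends AbstractHoodieWriteClient {
    @Override
    protected String actionType() {
        return "commit";
    }

    // Assumption for the sketch: each input record yields one write status.
    public boolean upsert(String instantTime, List<String> records) {
        return commit(instantTime, records);
    }
}

// Bootstrap path: no duplicated commit code, only the bootstrap-specific entry point.
class HoodieBootstrapClient extends AbstractHoodieWriteClient {
    @Override
    protected String actionType() {
        return "bootstrap";
    }

    public boolean bootstrap(String instantTime, List<String> sourceFiles) {
        return commit(instantTime, sourceFiles);
    }
}
```

With this layout, a fix to the commit path is made once in the base class and both clients pick it up, which is the duplication this issue aims to remove.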

       

            People

              Assignee: Balaji Varadarajan (vbalaji)
              Reporter: Balaji Varadarajan (vbalaji)
              Votes: 0
              Watchers: 3


                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m