Apache Hudi / HUDI-152

Invoke preCombine in the realtime view by converting ArrayWritable to Avro

Details

    Description

      There are 2 issues with the realtime input format:

       

      1. Delta records (updates) might not carry the entire row change log. For such an update, we need to be able to call preCombine of the HoodieRecordPayload implementation so that the existing data from parquet (the full row change log) is merged with the new column values being updated.
      2. If a custom implementation of HoodieRecordPayload performs some custom computation of columns, that computation is currently skipped by the realtime input format. We need to honor it by calling preCombine.
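
To make the first case concrete, here is a minimal sketch of the merge described above. It is self-contained and does not use Hudi or Avro: `PartialRowPayload`, its `preCombine` method, and the column names are hypothetical stand-ins for a custom payload, not Hudi's actual HoodieRecordPayload interface. The conversion from ArrayWritable to Avro mentioned in the title is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RealtimeMergeSketch {

    /**
     * Hypothetical stand-in for a custom record payload. This is NOT Hudi's
     * HoodieRecordPayload; it only mirrors the idea of a preCombine hook
     * that merges two versions of the same row.
     */
    static final class PartialRowPayload {
        // Column name -> value. A column missing from the map was not part of this change.
        private final Map<String, Object> columns;

        PartialRowPayload(Map<String, Object> columns) {
            this.columns = new LinkedHashMap<>(columns);
        }

        /**
         * Merges this (newer, possibly partial) record onto an older full row.
         * Columns present in the update win; all other columns are carried over.
         */
        PartialRowPayload preCombine(PartialRowPayload older) {
            Map<String, Object> merged = new LinkedHashMap<>(older.columns);
            merged.putAll(this.columns);
            return new PartialRowPayload(merged);
        }

        Map<String, Object> columns() {
            return columns;
        }
    }

    /** Builds a full base row, applies a one-column delta, and returns the merged row. */
    static Map<String, Object> mergeExample() {
        // Full row, as it would be read from the parquet base file.
        Map<String, Object> baseRow = new LinkedHashMap<>();
        baseRow.put("id", 1);
        baseRow.put("name", "alice");
        baseRow.put("city", "SF");

        // Delta record that only carries the key and the changed column.
        Map<String, Object> delta = new LinkedHashMap<>();
        delta.put("id", 1);
        delta.put("city", "NYC");

        return new PartialRowPayload(delta)
                .preCombine(new PartialRowPayload(baseRow))
                .columns();
    }

    public static void main(String[] args) {
        System.out.println(mergeExample()); // prints {id=1, name=alice, city=NYC}
    }
}
```

Without the preCombine call, a reader of the realtime view would only see the delta (`id` and `city`) and lose `name`. The same hook is where a custom column computation, as in the second case, would run.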

       

      Both of the above are use cases for power users who implement their own custom record payload. Since this is not common, this issue is lower priority.

          People

            nishith29 Nishith Agarwal