  1. Hive
  2. HIVE-14841 Replication - Phase 2
  3. HIVE-16813

Incremental REPL LOAD should load the events in the same sequence as it is dumped.

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 3.0.0
    • Component/s: Hive, repl

    Description

      Currently, incremental REPL DUMP uses $dumpdir/<eventID> to dump the metadata and data files corresponding to each event. Events are dumped in the same sequence in which they were generated.

      REPL LOAD, however, lists the directories inside $dumpdir using listStatus and sorts them with the compareTo method of the FileStatus class, which orders names alphabetically without first checking their length.
      As a result, event 100 is processed before event 99, leaving the replica database out of sync with the source.

      REPL LOAD needs a customized comparison to sort the FileStatus entries.
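
      A minimal sketch of the ordering problem and one possible fix, not the code from the attached patches. It uses plain directory-name strings as a stand-in for the FileStatus paths, and the class, comparator, and method names (EventDirOrder, BY_EVENT_ID, sortEventDirs) are hypothetical. It assumes event IDs are non-negative decimal numbers without leading zeros, so comparing by length first and then alphabetically matches numeric order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class EventDirOrder {
    // Hypothetical comparator: shorter names sort first; names of equal
    // length fall back to plain alphabetical comparison.
    // Assumption: names are decimal event IDs without leading zeros.
    static final Comparator<String> BY_EVENT_ID = (a, b) -> {
        if (a.length() != b.length()) {
            return Integer.compare(a.length(), b.length());
        }
        return a.compareTo(b);
    };

    // Returns a new list ordered by event ID; the input is left untouched.
    static List<String> sortEventDirs(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(BY_EVENT_ID);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("100", "99", "101", "9");

        // Plain alphabetical order puts "100" ahead of "99".
        List<String> lexical = new ArrayList<>(dirs);
        Collections.sort(lexical);
        System.out.println("alphabetical: " + lexical);            // [100, 101, 9, 99]

        // Length-aware order restores the sequence the events were dumped in.
        System.out.println("by event ID:  " + sortEventDirs(dirs)); // [9, 99, 100, 101]
    }
}
```

      In a real REPL LOAD path the same comparison would be applied to the name of each listed directory rather than to bare strings; parsing the names as long values is an alternative if leading zeros are possible.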

      Attachments

        1. HIVE-16813.02.patch
          17 kB
          Sankar Hariappan
        2. HIVE-16813.01.patch
          13 kB
          Sankar Hariappan

            People

              Assignee: Sankar Hariappan (sankarh)
              Reporter: Sankar Hariappan (sankarh)
              Votes: 0
              Watchers: 5
