  1. HBase
  2. HBASE-26225

let hbase.mapreduce.bulkload.assign.sequenceNumbers take effect in SecureBulkLoadManager.secureBulkLoadHFiles

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Performance
    • Labels: None

    Description

      HBASE-10958 added a flush before bulk load to obtain the latest sequence ID and prevent data loss during replay. 'hbase.mapreduce.bulkload.assign.sequenceNumbers' controls whether to flush before a bulk load, but SecureBulkLoadManager always passes true for that flag. If we bulk load frequently, every load triggers a flush and we end up with a lot of small files. Can we make 'hbase.mapreduce.bulkload.assign.sequenceNumbers' take effect in SecureBulkLoadManager? With the setting disabled, -1 is passed as the sequenceId, and we won't lose data.

      SecureBulkLoadManager.java, secureBulkLoadHFiles:

      return region.bulkLoadHFiles(familyPaths, true, new SecureBulkLoadListener(fs, bulkToken, conf), request.getCopyFile(), clusterIds, request.getReplicate());
      

      HRegion.java

      public Map<byte[], List<Path>> bulkLoadHFiles(Collection<Pair<byte[], String>> familyPaths,
          boolean assignSeqId, BulkLoadListener bulkLoadListener, boolean copyFile,
          List<String> clusterIds, boolean replicate)
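
      The proposed change can be sketched roughly as follows. This is a minimal, self-contained illustration and not the attached patch: `Region`, `RecordingRegion`, and the `Map`-based configuration are hypothetical stand-ins for HRegion and the HBase Configuration, and the default of true for the property is an assumption of the sketch. It only shows the idea of reading the property and forwarding it as assignSeqId instead of hard-coding true.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BulkLoadFlagSketch {
    // Property name taken from the issue text.
    static final String ASSIGN_SEQ_IDS = "hbase.mapreduce.bulkload.assign.sequenceNumbers";

    /** Hypothetical stand-in for the part of HRegion that matters here. */
    interface Region {
        void bulkLoadHFiles(List<String> familyPaths, boolean assignSeqId);
    }

    /** Records flushes and the sequence ID used, so the effect of the flag is visible. */
    static class RecordingRegion implements Region {
        int flushes = 0;
        long lastSeqId = 0;
        private long nextSeqId = 100; // arbitrary starting value for the sketch

        @Override
        public void bulkLoadHFiles(List<String> familyPaths, boolean assignSeqId) {
            if (assignSeqId) {
                flushes++;               // flush first to obtain a fresh sequence ID
                lastSeqId = nextSeqId++;
            } else {
                lastSeqId = -1;          // no flush; -1 is passed as the sequence ID
            }
        }
    }

    /** Reads the flag; defaulting to true is an assumption of this sketch. */
    static boolean assignSeqIds(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(ASSIGN_SEQ_IDS, "true"));
    }

    /** Forwards the configured value instead of a hard-coded true. */
    static void secureBulkLoadHFiles(Region region, Map<String, String> conf,
                                     List<String> familyPaths) {
        region.bulkLoadHFiles(familyPaths, assignSeqIds(conf));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ASSIGN_SEQ_IDS, "false");
        RecordingRegion region = new RecordingRegion();
        List<String> paths = Arrays.asList("cf:/tmp/hfile1"); // made-up path
        for (int i = 0; i < 3; i++) {
            secureBulkLoadHFiles(region, conf, paths);
        }
        System.out.println("flushes=" + region.flushes + " lastSeqId=" + region.lastSeqId);
    }
}
```

      With the property set to false, repeated bulk loads in this sketch trigger no flushes; leaving it unset keeps the current flush-per-load behaviour.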
      

       

      Attachments

        1. SecureBulkLoadManager.patch
          2 kB
          xi chaomin

          People

            Assignee: Unassigned
            Reporter: xi chaomin