  1. HBase
  2. HBASE-16015

Usability - VerifyReplication performance is too slow

Details

    • Improvement
    • Status: Open
    • Critical
    • Resolution: Unresolved
    • None
    • None
    • Usability
    • None

    Description

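      VerifyReplication is too slow on a geo-replicated cluster. Digging into the code, I found that the input scanner caching defaults to 1 for requests to the target cluster, so the scan fetches only one row per RPC.
      The default should be raised to a more suitable value, or the option should at least be documented in the usage message, for example: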
      -Dhbase.mapreduce.scan.cachedrows=100

      TableInputFormat.java
      public static final String SCAN_CACHEDROWS = "hbase.mapreduce.scan.cachedrows";
      
      VerifyReplication.java
      Configuration conf = context.getConfiguration();
      final Scan scan = new Scan();        scan.setCaching(conf.getInt(TableInputFormat.SCAN_CACHEDROWS, 1));
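
To illustrate why a caching value of 1 hurts across a high-latency link, here is a rough sketch that estimates the number of scanner round trips for a scan. It is a simplified model, not HBase code: it assumes each scanner RPC returns at most `caching` rows and ignores result-size limits and server-side processing time. The class name, the row count and the round-trip time are all hypothetical values chosen for illustration.

```java
public class ScanCachingEstimate {

    /**
     * Number of scanner RPCs needed to read `rows` rows when each RPC
     * returns at most `caching` rows (simplified model, see assumptions above).
     */
    static long rpcCount(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        // Ceiling division: a partially filled last batch still costs one RPC.
        return (rows + caching - 1) / caching;
    }

    /** Time spent purely on network round trips, in milliseconds. */
    static long networkWaitMillis(long rows, int caching, long roundTripMillis) {
        return rpcCount(rows, caching) * roundTripMillis;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;      // hypothetical table size
        long roundTripMillis = 50L;  // hypothetical cross-datacenter latency

        for (int caching : new int[] {1, 100}) {
            System.out.printf("caching=%d -> %d RPCs, ~%d s waiting on the network%n",
                    caching,
                    rpcCount(rows, caching),
                    networkWaitMillis(rows, caching, roundTripMillis) / 1000);
        }
    }
}
```

Under these assumed numbers, caching of 1 needs 1,000,000 round trips, while caching of 100 needs 10,000, which is why the setting matters most when the target cluster is far away.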
      

      If there is agreement, I will add the following line to the printUsage method:

      VerifyReplication.java
      System.err.println("For performance consider the following option, Input scanner caching for source to target cluster request\n"
                  + "-Dhbase.mapreduce.scan.cachedrows=100");
      

          People

            Assignee: Unassigned
            Reporter: Karthik Palanisamy (karthikShva123@gmail.com)
            Votes: 1
            Watchers: 3
