HBase / HBASE-28082

oldWALs naming can be incompatible with HBase backup


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0, 3.0.0-beta-1
    • Component/s: backup&restore
    • Labels: None
    • Environment: Encountered on HBase a2e7d2015e9f603e46339d0582e29a86843b9324 (branch-2), running in Kubernetes.
    • Hadoop Flags: Reviewed

    Description

      I am testing the HBase backup functionality and noticed the following warning when running "hbase backup create incremental ...":

      23/09/13 15:44:10 WARN org.apache.hadoop.hbase.backup.util.BackupUtils: Skip log file (can't parse): hdfs://hdfsns/hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694609969312

      In my setup, the oldWALs are given names that appear to break "ServerName.valueOf(s)" in "BackupUtils#parseHostFromOldLog(Path p)":

      user@hadoop-client-769bc9946-xqrt2:/$ hdfs dfs -ls hdfs:///hbase/hbase/oldWALs
      Found 42 items
      -rw-r--r--   1 hbase hbase     775421 2023-09-13 13:14 hdfs:///hbase/hbase/oldWALs/hbase-master-0.minikube-shared%2C16000%2C1694609954719.hbase-master-0.minikube-shared%2C16000%2C1694609954719.regiongroup-0.1694609957984$masterlocalwal$
      -rw-r--r--   1 hbase hbase      26059 2023-09-13 13:29 hdfs:///hbase/hbase/oldWALs/hbase-master-0.minikube-shared%2C16000%2C1694609954719.hbase-master-0.minikube-shared%2C16000%2C1694609954719.regiongroup-0.1694610867894$masterlocalwal$
      ...
      -rw-r--r--   1 hbase hbase     242479 2023-09-13 14:16 hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694609969312
      -rw-r--r--   1 hbase hbase       4364 2023-09-13 14:16 hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694610188654
      ...
      -rw-r--r--   1 hbase hbase      70802 2023-09-13 13:15 hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.meta.1694609970025.meta
      -rw-r--r--   1 hbase hbase         93 2023-09-13 13:04 hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.meta.1694610188627.meta
      ...
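
      The parse failure can be illustrated with a minimal, self-contained sketch. It assumes (this has not been checked against the HBase source) that the backup code URL-decodes the file name, drops everything after the last ".", and expects the remainder to have the form host,port,startcode. The class and method names are hypothetical, and the second file name is a made-up shorter name included only for contrast.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; it does not use or reproduce the real HBase classes.
public class OldWalNameSketch {

    // Assumption: the parser decodes the name and strips the trailing ".<timestamp>".
    public static String prefixBeforeLastDot(String fileName) {
        String decoded = URLDecoder.decode(fileName, StandardCharsets.UTF_8);
        int idx = decoded.lastIndexOf('.');
        return idx < 0 ? decoded : decoded.substring(0, idx);
    }

    // Assumption: a server name is exactly "host,port,startcode" with numeric port and start code.
    public static boolean looksLikeServerName(String candidate) {
        String[] parts = candidate.split(",");
        if (parts.length != 3) {
            return false;
        }
        try {
            Integer.parseInt(parts[1]);
            Long.parseLong(parts[2]);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // File name taken from the warning in this report.
        String reported = "hbase-region-0.hbase-region.minikube-shared.svc.cluster.local"
            + "%2C16020%2C1694609964681."
            + "hbase-region-0.hbase-region.minikube-shared.svc.cluster.local"
            + "%2C16020%2C1694609964681.regiongroup-0.1694609969312";
        // Hypothetical name without the repeated server name and regiongroup suffix.
        String hypothetical = "hbase-region-0.hbase-region.minikube-shared.svc.cluster.local"
            + "%2C16020%2C1694609964681.1694609969312";

        System.out.println(looksLikeServerName(prefixBeforeLastDot(reported)));     // prints false
        System.out.println(looksLikeServerName(prefixBeforeLastDot(hypothetical))); // prints true
    }
}
```

      Under these assumptions, the reported name still contains a second copy of the server name and the "regiongroup-0" suffix after the timestamp is stripped, so the remainder has five comma-separated fields instead of three.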

      I'd say this is not a bug in the backup system, but rather in whatever gives the oldWAL files their names. However, I am not familiar enough with the HBase code to find where these files are created. Any pointers are appreciated.

      Given that this causes some logs to be skipped during backup, I guess this can lead to data loss when restoring from a backup?


            People

              janvanbesien Jan Van Besien
              dieterdp_ng Dieter De Paepe