  1. Hadoop HDFS
  2. HDFS-16493

[SBN Read] When fast-path tailing is enabled, a standby or observer NameNode may read uncommitted data


Details

    • Bug
    • Status: Open
    • Critical
    • Resolution: Unresolved
    • journal-node, namanode

    Description

      Although fast-path tailing uses a quorum read to pull the edit log, it appears that it can still read uncommitted data in some corner cases.

      Here is an example. Suppose we have three JournalNodes (JNs) whose initial state is:

       

      epoch 1
      JN1 [1-3](in-progress)
      JN2 [1-3](in-progress)
      JN3 [1-4](in-progress)
      Note that in epoch 1, txids 1-3 were committed and txid 4 was not.
      

      When a failover occurs, suppose the new writer cannot reach JN3 because of a network partition. It completes the recovery stage without JN3 and then writes a new txid 4 in epoch 2, whose content differs from JN3's txid 4.

       

      epoch 2
      JN1 [1-3](finalized) [4-4](in-progress)
      JN2 [1-3](finalized) [4-4](in-progress)
      JN3 [1-4](in-progress)
      Note that the content of txid 4 on JN3 differs from txid 4 on the other JNs.
      

       

      Now a reading NameNode pulls edits. It contacts JN2 and JN3 and so receives responses from a majority. However, the two logs have the same length but different content, and the responses carry no further information for deciding which log is correct. If the NameNode chooses JN3's log, the metadata is corrupted.
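
The scenario above can be modeled in a few lines. This is a minimal sketch and not HDFS code: the class `QuorumReadDemo`, its methods, and the payload strings such as `op4@epoch2` are made up for illustration. It only shows that two responses forming a majority can agree on length yet disagree on the content of txid 4.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class QuorumReadDemo {
    /** Builds one JN's view of the log: txid (starting at 1) -> edit payload. */
    public static Map<Long, String> journal(String... edits) {
        Map<Long, String> log = new TreeMap<>();
        for (int i = 0; i < edits.length; i++) {
            log.put((long) (i + 1), edits[i]);
        }
        return log;
    }

    /** Returns the first txid at which the two logs differ, or -1 if none. */
    public static long firstDivergence(Map<Long, String> a, Map<Long, String> b) {
        for (Long txid : a.keySet()) {
            if (!Objects.equals(a.get(txid), b.get(txid))) {
                return txid;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // State in epoch 2, as seen by the two JNs the reader contacts.
        Map<Long, String> jn2 = journal("op1", "op2", "op3", "op4@epoch2");
        Map<Long, String> jn3 = journal("op1", "op2", "op3", "op4@epoch1");

        int totalJns = 3;
        int majority = totalJns / 2 + 1;
        List<Map<Long, String>> responses = Arrays.asList(jn2, jn3);

        System.out.println("majority reached: " + (responses.size() >= majority));
        System.out.println("same length: " + (jn2.size() == jn3.size()));
        System.out.println("first conflicting txid: " + firstDivergence(jn2, jn3));
    }
}
```

Both responses report txids 1-4, so a reader that only counts responses per txid sees a quorum for txid 4 even though the two payloads disagree.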

      There is a test example patch [^example.patch] for running and debugging.

      To fix this, I think we should add the finalized state to GetJournaledEditsResponseProto so that the faulty log can be discarded.
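
One possible reading of this proposal is sketched below. Everything in it is an assumption: the `Response` class and its fields `segmentStartTxId` and `lastFinalizedTxId` are hypothetical, do not exist in the current GetJournaledEditsResponseProto, and are not taken from the attached patch. The idea is that if any JN reports txids up to N as finalized, a response served from an in-progress segment that starts at or before N must come from a JN that missed the finalization, so the reader can discard it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FinalizedAwareReader {
    /** Hypothetical response shape; these fields are assumptions, not the real proto. */
    public static final class Response {
        public final String jn;
        public final long segmentStartTxId;  // first txid of the in-progress segment served
        public final long lastFinalizedTxId; // highest finalized txid on this JN, 0 if none

        public Response(String jn, long segmentStartTxId, long lastFinalizedTxId) {
            this.jn = jn;
            this.segmentStartTxId = segmentStartTxId;
            this.lastFinalizedTxId = lastFinalizedTxId;
        }
    }

    /**
     * Drops responses whose in-progress segment overlaps a range that some
     * other JN already reports as finalized.
     */
    public static List<Response> discardStale(List<Response> responses) {
        long maxFinalized = 0;
        for (Response r : responses) {
            maxFinalized = Math.max(maxFinalized, r.lastFinalizedTxId);
        }
        List<Response> kept = new ArrayList<>();
        for (Response r : responses) {
            if (r.segmentStartTxId > maxFinalized) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Epoch 2 state from the description: JN2 finalized [1-3], JN3 did not.
        Response jn2 = new Response("JN2", 4, 3);
        Response jn3 = new Response("JN3", 1, 0);
        for (Response r : discardStale(Arrays.asList(jn2, jn3))) {
            System.out.println("kept: " + r.jn);
        }
    }
}
```

Under these assumptions only JN2's response survives. A single response is below the majority of two, so in this sketch the reader would have to wait or contact JN1 before applying txid 4 rather than trust either copy.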

      Attachments

        1. exapmle.v1.patch
          25 kB
          liutongwei


            People

              Assignee: Unassigned
              Reporter: liutongwei