  1. Hadoop HDFS
  2. HDFS-16950

Gap in edits after -initializeSharedEdits

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: journal-node, namenode
    • Labels: None

    Description

      The NameNode failed to start in a production cluster after the JournalNode (JN) role was migrated.

      ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
      java.io.IOException: There appears to be a gap in the edit log.  We expected txid xxxxxx, but got txid xxxxxx. 

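      -initializeSharedEdits was issued as part of the role migration step. Note that no checkpoint had been performed in the preceding few hours.

      -initializeSharedEdits created a new log segment starting from the edits_inprogress transaction and deleted all older transactions. Because the latest fsimage predates those deleted transactions, the NameNode can no longer replay the edits between the fsimage and the new segment, which produces the gap reported above.

      My ask is to delete only the edit transactions that are older than the latest fsimage transaction. Currently, all transactions are deleted, and no such check is enforced in JNStorage#format().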
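
      The requested check can be illustrated with a small sketch. This is not the JNStorage code and does not use any Hadoop API: the class EditRetentionSketch, the Segment type, and both methods are hypothetical names introduced only for illustration, and the transaction IDs are made up. It assumes segments are sorted by their first txid and that the fsimage covers every transaction up to and including fsimageTxId.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop.
public class EditRetentionSketch {

    // Stand-in for one edit log segment covering [firstTxId, lastTxId].
    static final class Segment {
        final long firstTxId;
        final long lastTxId;

        Segment(long firstTxId, long lastTxId) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
        }
    }

    // Segments that are safe to delete: every transaction in them
    // is already contained in the fsimage.
    static List<Segment> purgeable(List<Segment> segments, long fsimageTxId) {
        List<Segment> out = new ArrayList<>();
        for (Segment s : segments) {
            if (s.lastTxId <= fsimageTxId) {
                out.add(s);
            }
        }
        return out;
    }

    // First txid that is needed to replay edits on top of the fsimage
    // but is not covered by any segment, or -1 if there is no gap.
    // Assumes segments are sorted by firstTxId.
    static long firstMissingTxId(List<Segment> segments, long fsimageTxId) {
        long expected = fsimageTxId + 1;
        for (Segment s : segments) {
            if (s.lastTxId < expected) {
                continue; // fully covered by the fsimage
            }
            if (s.firstTxId > expected) {
                return expected; // gap between fsimage and this segment
            }
            expected = s.lastTxId + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        long fsimageTxId = 100; // made-up value
        List<Segment> all = Arrays.asList(
            new Segment(1, 50), new Segment(51, 100),
            new Segment(101, 150), new Segment(151, 200));

        // Requested behavior: drop only segments older than the fsimage.
        List<Segment> kept = new ArrayList<>(all);
        kept.removeAll(purgeable(all, fsimageTxId));
        System.out.println(firstMissingTxId(kept, fsimageTxId));

        // Reported behavior: only the newest segment survives.
        List<Segment> onlyNewest = Arrays.asList(new Segment(151, 200));
        System.out.println(firstMissingTxId(onlyNewest, fsimageTxId));
    }
}
```

      In this made-up scenario, keeping every segment newer than the fsimage leaves no gap, while keeping only the newest segment leaves txid 101 unavailable, which mirrors the "expected txid ..., but got txid ..." failure in the log above.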

          People

            Assignee: Unassigned
            Reporter: Karthik Palanisamy (kpalanisamy)
            Votes: 0
            Watchers: 5
