Uploaded image for project: 'ZooKeeper'
  1. ZooKeeper
  2. ZOOKEEPER-3658

Potential data inconsistency due to txns gap in committedLog when ZkDB not fully shutdown

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Invalid
    • 3.6.0, 3.5.6
    • None
    • server
    • None

    Description

      During DIFF sync, the txns will be applied to learner's DataTree but it won't be added into the in memory committed txns cache in ZkDatabase. If this server became new leader later, and when other servers try to sync with it, it may cause data inconsistency due to part of txns are missing.

      This is not a problem if we fully shutdown the ZkDB and reload from disk, but the current behavior in 3.5 and 3.6 will not fully shutdown the DB, which is a nice optimization to reduce the unavailable time with large snapshot.

      Internally, we have another version of 'Retain DB' implementation, and we caught this issue with the digest feature we just upstreamed, and have fixed that internally. Just realized we haven't upstreamed that, and this is the Jira for that issue, will send a PR for this soon.

      Attachments

        Activity

          People

            lvfangmin Fangmin Lv
            lvfangmin Fangmin Lv
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: