SPARK-25171 (Spark)

After restart, StreamingContext replays the last successful micro-batch from right before the stop


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.3.1
    • Fix Version/s: None
    • Component/s: DStreams

    Description

      Please look at this line:

      https://github.com/apache/spark/blob/8bde4678166f5f01837919d4f8d742b89f5e76b8/streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala#L216

      "checkpointTime" is the time of a micro-batch that has already completed successfully, so why is it still treated as "pending" and rescheduled on restart?

      I think this is a bug: it causes the last successful micro-batch to be processed again, i.e. duplicate processing.
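      To make the report concrete, here is a minimal, self-contained model of the batch-time arithmetic around the linked line. It is a sketch, not Spark's actual code: the `until` helper mimics the half-open, inclusive-lower-bound range that `Time.until` produces, and the concrete times (5000, 8000, a 1000 ms batch duration) are hypothetical. Because the lower bound is inclusive, `checkpointTime` itself lands in the set of batch times regenerated after restart even though that batch already completed.

      ```scala
      // Hypothetical sketch of the restart batch-time computation; names
      // mirror JobGenerator.restart() but this is NOT the Spark source.
      object RestartSketch {
        // Models a half-open range [from, to) stepped by `interval`.
        // Note the INCLUSIVE lower bound: `from` is the first element.
        def until(from: Long, to: Long, interval: Long): List[Long] =
          (from until to by interval).toList

        def main(args: Array[String]): Unit = {
          val batchDuration  = 1000L // hypothetical batch interval (ms)
          val checkpointTime = 5000L // last micro-batch that completed OK
          val restartTime    = 8000L // first batch time after restart

          // Batch times considered "missed while down" and rescheduled.
          val downTimes = until(checkpointTime, restartTime, batchDuration)
          println(downTimes.mkString(", "))
          // checkpointTime itself is rescheduled, hence the duplicate run:
          println(downTimes.contains(checkpointTime))
        }
      }
      ```

      Starting the range at `checkpointTime + batchDuration` instead would exclude the already-completed batch; whether that is safe in all recovery paths is exactly the question this issue raises.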


          People

            Assignee: Unassigned
            Reporter: Haopu Wang
            Saisai Shao
            Votes: 0
            Watchers: 1
