  1. Spark
  2. SPARK-28603

Spark Streaming application receives inconsistent input events per batch interval

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.6.3
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels:
      None

      Description

      We run a Spark Streaming application with a 2 sec batch interval. The application is configured to receive from a RabbitMQ queue, and the batch interval was chosen based on the resources available in the cluster and on the processing time that can be sustained without causing scheduling delays. For each run we set MaxReceiverRate and BlockInterval and enable BackPressure, so that each batch delivers consistent performance.
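
The three settings named above can be written down as Spark properties. The sketch below is only an illustration: the report does not state the exact keys it used, so the mapping to `spark.streaming.receiver.maxRate`, `spark.streaming.blockInterval` and `spark.streaming.backpressure.enabled` is an assumption, and `to_submit_args` is a hypothetical helper that renders them as `spark-submit --conf` arguments.

```python
# Assumed mapping of the settings named in the report to Spark Streaming
# property keys; the report does not say which keys were actually set.
streaming_conf = {
    "spark.streaming.receiver.maxRate": "75",        # max records per second, per receiver
    "spark.streaming.blockInterval": "50ms",         # how often received data is cut into blocks
    "spark.streaming.backpressure.enabled": "true",  # let Spark adapt the receive rate
}


def to_submit_args(conf):
    """Render a dict of properties as spark-submit --conf arguments (hypothetical helper)."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(to_submit_args(streaming_conf)))
```

Keeping the values in one place like this makes it easier to record exactly which settings were in effect for a run when comparing batch sizes between runs.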

      For example, with MaxReceiverRate set to "75", BlockInterval = 50ms and backPressure enabled, we expect each 2 sec batch to receive 150 msgs (75 msgs/sec x 2 sec). Most of the time we see this, except in a few cases where some batches receive "0" events and a following batch receives, say, 3000 msgs (far more than the MaxReceiverRate allows). We are not sure what causes this unexpected batch sizing. Because of it, our application incurs large scheduling delays and processing is unable to catch up with the incoming msg rate.
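
The expectation in the example can be checked with a little arithmetic. This is a minimal sketch assuming a single receiver (the report does not say how many receivers were used); `expected_batch` is a hypothetical helper, not part of Spark. It only quantifies the gap between the expected and the observed batch size and does not explain the cause.

```python
def expected_batch(max_rate_per_sec, batch_interval_ms, block_interval_ms, receivers=1):
    """Upper bound on records per batch and number of blocks per batch.

    receivers=1 is an assumption; the report does not give the receiver count.
    """
    max_records = max_rate_per_sec * receivers * batch_interval_ms // 1000
    # Each block interval produces one block per receiver within a batch.
    blocks = receivers * (batch_interval_ms // block_interval_ms)
    return max_records, blocks


max_records, blocks = expected_batch(75, 2000, 50)
observed = 3000  # size of the oversized batch mentioned in the report

print(max_records)             # 150 records per 2 sec batch at most
print(blocks)                  # 40 blocks per batch
print(observed / max_records)  # 20.0 times the expected upper bound
```

Under these assumptions a 3000 msg batch carries as much data as 20 normal batches, which is roughly 40 seconds of input at 75 msgs/sec arriving in a single 2 sec interval.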

       

            People

            • Assignee:
              Unassigned
              Reporter:
              Raja.Boyangari Raja
            • Votes:
              1 Vote for this issue
              Watchers:
              2

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 72h
                Remaining:
                Remaining Estimate - 72h
                Logged:
                Time Spent - Not Specified