  1. Apache Storm
  2. STORM-3279

Kafka trident spout could lose its position with EARLIEST or LATEST FirstPollOffsetStrategy

Details

    Description

      In the emitPartitionBatch() method of KafkaTridentSpoutEmitter, if kafkaConsumer.poll(pollTimeoutMs) returns 0 records for the very first transaction while FirstPollOffsetStrategy is set to EARLIEST or LATEST, the spout fails to move to the earliest or latest offset and instead continues from the last metadata position.

       

      The sequence of events that causes this bug:

       

      1. FirstPollOffsetStrategy is set to EARLIEST or LATEST.

      2. For the first transaction after restart (txid1), based on the linked line L164, currentBatch is initialized to lastBatchMeta (which need not be null).

      3. Later, at L171, the consumer seeks to "start" (for EARLIEST) or "end" (for LATEST).

      4. consumer.poll(pollTimeoutMs) is then called.

      5. If poll returns one or more records, currentBatch is set to new metadata. If poll returns 0 records, currentBatch is not reset, i.e. it is still lastBatchMeta (which need not be null).

       

      As a result, in transaction txid2 (the one after txid1), isFirstPoll() returns false and the spout continues from lastBatchMeta instead of from the EARLIEST or LATEST offset.
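
      The steps above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the actual Storm source: FirstPollSketch, BatchMeta, FakeConsumer and firstOffsetOfSecondBatch() are hypothetical names, the consumer is an in-memory stand-in for a single partition, and only the LATEST case is modeled. It assumes the last batch before the restart covered offsets 30..41 and the log end is at offset 100.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the flow in this report (LATEST case only). */
public class FirstPollSketch {

    /** Stand-in for batch metadata: first and last offset of an emitted batch. */
    static final class BatchMeta {
        final long firstOffset;
        final long lastOffset;
        BatchMeta(long firstOffset, long lastOffset) {
            this.firstOffset = firstOffset;
            this.lastOffset = lastOffset;
        }
    }

    /** In-memory stand-in for a consumer on one partition holding offsets [0, logEnd). */
    static final class FakeConsumer {
        private long logEnd;
        private long position;
        FakeConsumer(long logEnd) { this.logEnd = logEnd; }
        void seekToEnd() { position = logEnd; }
        void seek(long offset) { position = offset; }
        void append(int count) { logEnd += count; }
        List<Long> poll() {
            List<Long> offsets = new ArrayList<>();
            while (position < logEnd) {
                offsets.add(position++);
            }
            return offsets;
        }
    }

    private final FakeConsumer consumer;
    private boolean firstPollDone = false;

    FirstPollSketch(FakeConsumer consumer) { this.consumer = consumer; }

    /** Mirrors steps 2 to 5 of the report, including the problematic empty-poll path. */
    BatchMeta emitPartitionBatch(BatchMeta lastBatchMeta) {
        BatchMeta currentBatch = lastBatchMeta;           // step 2
        if (!firstPollDone) {
            consumer.seekToEnd();                         // step 3 (LATEST)
            firstPollDone = true;
        } else if (lastBatchMeta != null) {
            consumer.seek(lastBatchMeta.lastOffset + 1);  // resume after last metadata
        }
        List<Long> records = consumer.poll();             // step 4
        if (!records.isEmpty()) {                         // step 5
            currentBatch = new BatchMeta(records.get(0), records.get(records.size() - 1));
        }
        return currentBatch;                              // unchanged when poll was empty
    }

    /** Runs txid1 (empty poll) and txid2, returning the first offset emitted in txid2. */
    public static long firstOffsetOfSecondBatch() {
        FakeConsumer consumer = new FakeConsumer(100);
        FirstPollSketch emitter = new FirstPollSketch(consumer);
        BatchMeta beforeRestart = new BatchMeta(30, 41);
        BatchMeta afterTxid1 = emitter.emitPartitionBatch(beforeRestart); // seeks to 100, polls nothing
        consumer.append(5);                                               // offsets 100..104 arrive
        BatchMeta afterTxid2 = emitter.emitPartitionBatch(afterTxid1);
        return afterTxid2.firstOffset;
    }

    public static void main(String[] args) {
        System.out.println("txid2 starts at offset " + firstOffsetOfSecondBatch());
    }
}
```

      In this sketch txid2 starts at offset 42, the position right after the stale lastBatchMeta, rather than at offset 100, where the LATEST seek in txid1 had placed the consumer.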

       

       

            People

              srdo Stig Rohde Døssing
              janithkv Janith Kaiprath Valiyalappil