Spark / SPARK-19968

Use a cached instance of KafkaProducer for writing to Kafka via KafkaSink

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: Structured Streaming

    Description

      KafkaProducer is thread-safe, so a single instance can be reused to write out every batch. The Kafka documentation encourages this kind of usage, and it also improves performance.

      On average, an addBatch operation takes 25 ms with this patch, compared with 250+ ms without it.

      Benchmark results are posted on the GitHub PR.
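
The reuse pattern described above can be illustrated with a minimal sketch. This is not the code from the patch: `FakeProducer` is a hypothetical stand-in for a thread-safe producer, and `ProducerCache` and `add_batch` are illustrative names chosen here. It only shows the idea of keeping one producer per distinct configuration and handing the same instance to every batch.

```python
import threading


class FakeProducer:
    """Hypothetical stand-in for a thread-safe Kafka producer (not a real client)."""

    instances_created = 0  # counts how many producers were constructed

    def __init__(self, config):
        FakeProducer.instances_created += 1
        self.config = dict(config)
        self.sent = []
        self._lock = threading.Lock()

    def send(self, topic, value):
        # Guard the shared list so concurrent writers do not interleave badly.
        with self._lock:
            self.sent.append((topic, value))


class ProducerCache:
    """Returns one shared producer per distinct configuration (illustrative only)."""

    def __init__(self, factory):
        self._factory = factory
        self._producers = {}
        self._lock = threading.Lock()

    def get_or_create(self, config):
        # Configs are dicts, so build a hashable, order-independent key.
        key = tuple(sorted(config.items()))
        with self._lock:
            producer = self._producers.get(key)
            if producer is None:
                # Construction is the expensive step; do it once per config.
                producer = self._factory(config)
                self._producers[key] = producer
            return producer


def add_batch(cache, config, topic, rows):
    """Write one batch, reusing the cached producer instead of building a new one."""
    producer = cache.get_or_create(config)
    for row in rows:
        producer.send(topic, row)
    return producer


cache = ProducerCache(FakeProducer)
```

Calling `add_batch` repeatedly with an equal configuration returns the same producer object, so the construction cost is paid once rather than once per batch. A real cache would presumably also need a way to close and evict idle producers, which this sketch omits.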

          People

            Assignee: Prashant Sharma
            Reporter: Prashant Sharma
            Votes: 0
            Watchers: 2
