SPARK-35212

Spark Streaming LocationStrategy should provide a random option that maps Kafka partitions randomly to Spark executors

Details

    Description

      There are three LocationStrategy options: PreferBrokers, PreferConsistent, and PreferFixed. I have a scenario that needs a random one. There are many topic partitions, and they differ widely in the number of records they hold. I also have a lot of executors. PreferBrokers does not help here. PreferConsistent makes things worse, because some executors will always get the heavy tasks. PreferFixed does not help either, because the mapping is fixed, not to mention that I would have to create it manually.

      A random LocationStrategy should dispatch a topic partition to different executors in different windows. This would balance the load across Spark executors.
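
      To illustrate the request, here is a minimal sketch of a random partition-to-host assignment. It is not part of Spark: the class RandomPartitionAssignment, its assign method, and the seed parameter are hypothetical names, and partitions and hosts are plain strings so the example runs without Spark or Kafka on the classpath. A map like this could in principle be given to PreferFixed, but that mapping is supplied once when the stream is created, so it would not reshuffle per window, which is the gap this issue describes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper for illustration only; not a Spark API.
final class RandomPartitionAssignment {

    // Shuffles the partitions with a seeded Random, then deals them out
    // to hosts round-robin. Each host receives either floor(n/h) or
    // ceil(n/h) partitions, and a different seed gives a different order.
    // Assumes the partition names are distinct.
    static Map<String, String> assign(List<String> partitions,
                                      List<String> hosts,
                                      long seed) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        List<String> shuffled = new ArrayList<>(partitions);
        Collections.shuffle(shuffled, new Random(seed));

        Map<String, String> assignment = new LinkedHashMap<>();
        for (int i = 0; i < shuffled.size(); i++) {
            assignment.put(shuffled.get(i), hosts.get(i % hosts.size()));
        }
        return assignment;
    }
}
```

      In this sketch the seed stands in for something that changes per window, such as the batch time in milliseconds, so that a heavy partition would not stay on the same host from one window to the next. How a real strategy would obtain the executor list and the batch time inside Spark is left open here.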

          People

            Assignee: Unassigned
            Reporter: Wang Yuan (tiehexue)
            Votes: 0
            Watchers: 3

              Time Tracking

                Estimated: 2h
                Remaining: 2h
                Logged: Not Specified