Spark / SPARK-27025

Speed up toLocalIterator


Details

    • Type: Wish
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.3.3
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      The toLocalIterator method fetches partitions to the driver one at a time. However, as far as I can see, the computation required for a not-yet-fetched partition is not started until that partition is requested. Effectively, only one partition is computed at a time.

      Desired behavior: start computing all partitions immediately, while still downloading only one partition at a time.
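
      To make the difference concrete, here is a minimal sketch in plain Python that does not use Spark at all. It only models the two behaviors described above. A "partition" is assumed to be a zero-argument function that returns its rows, and all names (`make_partition`, `lazy_local_iterator`, `eager_local_iterator`) are hypothetical, chosen for illustration. A thread pool stands in for the cluster's executors.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence

# A "partition" here is a zero-argument function that computes a list of rows.
# This is a hypothetical stand-in for illustration, not a Spark type.
Partition = Callable[[], List[int]]


def make_partition(index: int, log: List[int]) -> Partition:
    """Build a toy partition that records in `log` when it is computed."""
    def compute() -> List[int]:
        log.append(index)  # list.append is thread-safe in CPython
        return [index * 10 + i for i in range(3)]
    return compute


def lazy_local_iterator(partitions: Sequence[Partition]) -> Iterator[int]:
    """Current behavior: compute each partition only when the consumer reaches it."""
    for compute in partitions:
        yield from compute()


def eager_local_iterator(
    partitions: Sequence[Partition], max_workers: int = 4
) -> Iterator[int]:
    """Desired behavior: start all computations now, hand over one partition at a time."""
    pool = ThreadPoolExecutor(max_workers=max_workers)
    futures = [pool.submit(compute) for compute in partitions]
    pool.shutdown(wait=False)  # accept no new work; submitted tasks still run

    def results() -> Iterator[int]:
        for future in futures:  # iterate in submission order to keep partition order
            yield from future.result()

    return results()
```

      With the lazy version, taking the first row from the iterator computes only partition 0, and later partitions are computed as the consumer reaches them. With the eager version, all partitions are submitted as soon as the function is called, and the consumer still receives the rows in partition order. The trade-off in this sketch is that completed results wait in memory until they are consumed.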

            People

              Assignee: Unassigned
              Reporter: Erik van Oosten (erikvanoosten)
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved: