Spark / SPARK-35793

Repartition before writing data source tables


Details

    • Type: Umbrella
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      This umbrella ticket tracks work on repartitioning data before writing data source tables. It covers:

      1. Repartition by the dynamic partition columns before writing dynamic partition tables.
      2. Repartition before writing normal (non-partitioned) tables to avoid generating too many small files.
      3. Improve local shuffle reader.
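
To illustrate why item 1 helps, here is a toy model in plain Python (not Spark code). It assumes each write task emits one file per distinct partition value it holds, and it compares an arbitrary round-robin row distribution with one grouped by the partition column. The function names, the `dt` column, and the task count are hypothetical and chosen only for this sketch.

```python
import zlib

def count_output_files(tasks, column):
    """Each task writes one file per distinct partition value it holds."""
    return sum(len({row[column] for row in rows}) for rows in tasks)

def round_robin(rows, num_tasks):
    """Spread rows over tasks with no regard to the partition column."""
    tasks = [[] for _ in range(num_tasks)]
    for i, row in enumerate(rows):
        tasks[i % num_tasks].append(row)
    return tasks

def repartition_by_column(rows, num_tasks, column):
    """Send all rows with the same partition value to the same task."""
    tasks = [[] for _ in range(num_tasks)]
    for row in rows:
        # crc32 is used as a stable stand-in for a hash partitioner.
        key = zlib.crc32(str(row[column]).encode("utf-8"))
        tasks[key % num_tasks].append(row)
    return tasks

def make_rows():
    """Twelve rows spread evenly over three hypothetical date partitions."""
    dates = ["2021-06-01", "2021-06-02", "2021-06-03"]
    return [{"dt": d, "id": i} for i in range(4) for d in dates]

if __name__ == "__main__":
    rows = make_rows()
    before = count_output_files(round_robin(rows, 4), "dt")
    after = count_output_files(repartition_by_column(rows, 4, "dt"), "dt")
    print("files without repartition:", before)
    print("files with repartition by dt:", after)
```

In this model the file count drops from (tasks × partition values) to the number of distinct partition values, because each value lands in exactly one task. In Spark terms this roughly corresponds to shuffling by the partition columns before the write; how the sub-tasks of this ticket implement it is not stated here.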


          People

            Assignee: Unassigned
            Reporter: Yuming Wang (yumwang)

