SPARK-32141: Repartition leads to out of memory


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.4.4
    • Fix Version/s: None
    • Environment: EC2
    • Labels: None

    Description

      We have an application that aggregates on 7 columns. To avoid an extra shuffle, we repartition on those same 7 columns before aggregating. This works well with 1 to 4 TB of data, but once the input grows past 4 TB the job fails with OOM errors or runs out of disk space.
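      As a minimal sketch of the pattern (the column names, input path, and the sum("value") aggregate are hypothetical placeholders, not our real schema):

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

val spark = SparkSession.builder.getOrCreate()

// Hypothetical input path; stands in for our real dataset.
val df = spark.read.parquet("s3://bucket/input")

val keys = Seq("c1", "c2", "c3", "c4", "c5", "c6", "c7")

val result = df
  .repartition(keys.map(df(_)): _*)    // one shuffle, hash-partitioned on the 7 keys
  .groupBy(keys.head, keys.tail: _*)   // can reuse that partitioning instead of shuffling again
  .agg(sum("value").as("total"))       // hypothetical aggregate
{code}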

       

      Is there a better approach to reducing the shuffle? For our biggest dataset, the Spark job has never completed with repartition. We are out of options.

       

      We have a 24-node cluster of r5.24xlarge machines with 1 TB of disk each. Our shuffle partition count (spark.sql.shuffle.partitions) is set to 6912.
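      For reference, this is roughly how that setting is applied in our jobs (the app name is hypothetical; only the 6912 value is from our setup):

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("seven-column-agg")                      // hypothetical app name
  .config("spark.sql.shuffle.partitions", "6912")   // default partition count for shuffles and repartition-by-expression
  .getOrCreate()
{code}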


          People

            Assignee: Unassigned
            Reporter: Lekshmi Nair (lekshminair24)
            Votes: 0
            Watchers: 1
