Details
- Type: Bug
- Status: Open
- Priority: Minor
- Resolution: Unresolved
- Affects Version/s: 2.4.4
- Fix Version/s: None
- Component/s: None
Description
We have an application that aggregates on 7 columns. To avoid shuffles, we tried repartitioning on those same 7 columns before the aggregation. This works well with 1 to 4 TB of data, but once the input exceeds 4 TB the job fails with OOM errors or runs out of disk space.
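A minimal sketch of the pattern we are using; the paths and column names (c1..c7, value) are hypothetical stand-ins:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum}

object AggregationJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("seven-key-aggregation").getOrCreate()

    // Hypothetical input path and key/value column names.
    val df = spark.read.parquet("s3://bucket/input")
    val keys = (1 to 7).map(i => col(s"c$i"))

    val aggregated = df
      .repartition(keys: _*)         // hash-partition rows by the 7 grouping keys
      .groupBy(keys: _*)             // the groupBy can then reuse that partitioning
      .agg(sum("value").as("total")) // hypothetical aggregate

    aggregated.write.parquet("s3://bucket/output")
  }
}
```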
Is there a better approach to reducing the shuffle? For our biggest dataset, the Spark job has never completed with repartitioning. We are out of options.
We have a 24-node cluster of r5.24xlarge machines with 1 TB of disk each, and spark.sql.shuffle.partitions is set to 6912.
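For reference, the shuffle partition count is applied as below; only the value 6912 comes from our setup, the surrounding code is illustrative:

```scala
// 6912 = 3 x 2304 total vCPUs (24 nodes x 96 vCPUs per r5.24xlarge);
// reading it as a multiple of cores is our assumption, only 6912 is the configured value.
spark.conf.set("spark.sql.shuffle.partitions", "6912")
```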