Details
- Type: Sub-task
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
We currently log memory usage and the number of records processed in Spark tasks, but we should improve how frequently we log this information. Currently we use the following code:
private long getNextLogThreshold(long currentThreshold) {
  // A very simple counter to keep track of number of rows processed by the
  // reducer. It dumps every 1 million times, and quickly before that.
  if (currentThreshold >= 1000000) {
    return currentThreshold + 1000000;
  }
  return 10 * currentThreshold;
}
The issue is that, after a while, the 10x growth factor means you have to process a huge number of records before the next log line is triggered.
A better approach would be to log this information at a fixed time interval. This would also help in debugging tasks that appear to be hung.
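As a rough illustration of interval-based logging (a minimal sketch only; the class, method, and interval names below are hypothetical and not part of any actual patch), the record handler could track wall-clock time instead of record-count thresholds:

import java.util.concurrent.TimeUnit;

public class IntervalProgressLogger {
  // Hypothetical fixed interval; the real value would presumably be configurable.
  private static final long LOG_INTERVAL_MS = TimeUnit.SECONDS.toMillis(30);

  private long rowCount = 0;
  private long lastLogTimeMs = System.currentTimeMillis();

  // Called once per processed record.
  public void onRecordProcessed() {
    rowCount++;
    long now = System.currentTimeMillis();
    // Emit a log line whenever the interval has elapsed, regardless of how
    // many records were processed, so slow tasks still log regularly.
    if (now - lastLogTimeMs >= LOG_INTERVAL_MS) {
      Runtime rt = Runtime.getRuntime();
      long usedMemory = rt.totalMemory() - rt.freeMemory();
      System.out.println("processed " + rowCount + " rows, used memory = " + usedMemory + " bytes");
      lastLogTimeMs = now;
    }
  }
}

A time-based check like this keeps the per-record cost to a single comparison while guaranteeing a log line at least once per interval as long as records keep arriving; a task that stops calling the handler entirely would need a separate timer thread to keep logging, which is beyond this sketch.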
Attachments
Issue Links
- links to