Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Affects Version/s: 1.5.0, 1.5.1, 1.5.2
- Environment: Ubuntu 14.04 LTS, Oracle JDK 1.8.51, Apache Tomcat 8.0.28, Spring 4
- Flags: Important
Description
It seems that there is a SparkContext jobProgressListener memory leak. Below I describe the steps I follow to reproduce it.
I have created a Java webapp that runs Spark SQL jobs in an abstract way: the jobs read data from HDFS, join them, and write the result to Elasticsearch using the ES-Hadoop connector. After a lot of consecutive runs I noticed that my heap space was full, and I got an out-of-heap-space error.
In the attached file AbstractSparkJobRunner, the method
public final void run(T jobConfiguration, ExecutionLog executionLog) throws Exception
runs each time a Spark SQL job is triggered, so I tried to reuse the same SparkContext for a number of consecutive runs. If certain rules apply, I try to clean up the SparkContext by first calling killSparkAndSqlContext. This code eventually runs:
    synchronized (sparkContextThreadLock) {
        if (javaSparkContext != null) {
            LOGGER.info("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! CLEARING SPARK CONTEXT!!!!!!!!!!!!!!!!!!!!!!!!!!!");
            javaSparkContext.stop();
            javaSparkContext = null;
            sqlContext = null;
            System.gc();
        }
        numberOfRunningJobsForSparkContext.getAndSet(0);
    }
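The reuse-then-stop lifecycle described above can be sketched roughly as follows. This is only an illustration, not the code from the attached AbstractSparkJobRunner: the `SharedContextHolder` class, the `maxRuns` rule, and the generic `factory`/`stopper` parameters are all hypothetical names, and a plain `Object` stands in for `JavaSparkContext` so that the example runs without Spark on the classpath.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical holder: reuses one context for several runs, then stops and drops it. */
final class SharedContextHolder<C> {
    private final Object lock = new Object();
    private final Supplier<C> factory;   // creates a fresh context on demand
    private final Consumer<C> stopper;   // stand-in for calling stop() on the context
    private final int maxRuns;           // assumed rule: recycle after this many runs
    private final AtomicInteger runs = new AtomicInteger();
    private C context;                   // guarded by lock

    SharedContextHolder(Supplier<C> factory, Consumer<C> stopper, int maxRuns) {
        this.factory = factory;
        this.stopper = stopper;
        this.maxRuns = maxRuns;
    }

    /** Returns the shared context, creating it if needed, and counts the run. */
    C acquire() {
        synchronized (lock) {
            if (context == null) {
                context = factory.get();
            }
            runs.incrementAndGet();
            return context;
        }
    }

    /** Stops and drops the context once the run limit is reached; returns true if it was stopped. */
    boolean releaseIfDue() {
        synchronized (lock) {
            if (context == null || runs.get() < maxRuns) {
                return false;
            }
            stopper.accept(context);
            context = null;   // drops only this holder's reference
            runs.set(0);
            return true;
        }
    }

    boolean hasContext() {
        synchronized (lock) {
            return context != null;
        }
    }
}

final class SharedContextHolderDemo {
    public static void main(String[] args) {
        SharedContextHolder<Object> holder = new SharedContextHolder<Object>(
                Object::new, c -> System.out.println("stopping context"), 2);
        holder.acquire();
        System.out.println(holder.releaseIfDue()); // first run: below the limit
        holder.acquire();
        System.out.println(holder.releaseIfDue()); // second run: limit reached
        System.out.println(holder.hasContext());
    }
}
```

With a real Spark context, the `stopper` would presumably wrap `javaSparkContext.stop()`; whether the stopped context then becomes collectible is exactly what this report questions.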
So at some point, when no other Spark SQL job needs to run, I stop the SparkContext (AbstractSparkJobRunner.killSparkAndSqlContext runs), and I expect it to be reclaimed by the garbage collector. However, this is not the case, even though my debugger shows that my JavaSparkContext object is null (see attached picture SparkContextPossibleMemoryLeakIDEA_DEBUG.png).
jvisualvm shows heap usage growing steadily even when the garbage collector is called (see attached picture SparkHeapSpaceProgress.png).
The Memory Analyzer Tool shows that a big part of the retained heap is assigned to _jobProgressListener (see attached picture SparkMemoryAfterLotsOfConsecutiveRuns.png and summary picture SparkMemoryLeakAfterLotsOfRunsWithinTheSameContext.png), although at the same time the JavaSparkContext in the singleton service is null.
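As a general JVM illustration of how a null field and a large retained heap can coexist (not a diagnosis of Spark's internals): setting one reference to null frees nothing while another live object still reaches the same data. In the minimal sketch below, `FakeProgressListener`, `FakeContext`, and `LongLivedRegistry` are hypothetical stand-ins invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a listener that accumulates per-job records. */
final class FakeProgressListener {
    final List<String> completedJobs = new ArrayList<>();
}

/** Hypothetical stand-in for a context that owns one listener. */
final class FakeContext {
    final FakeProgressListener listener = new FakeProgressListener();

    void runJob(String name) {
        listener.completedJobs.add(name);
    }
}

/** Hypothetical long-lived object that also holds a reference to the listener. */
final class LongLivedRegistry {
    final List<FakeProgressListener> listeners = new ArrayList<>();
}

final class RetentionDemo {
    /** Runs some jobs, drops the context reference, and counts the job records still reachable. */
    static int recordsStillReachable(int jobs) {
        LongLivedRegistry registry = new LongLivedRegistry();
        FakeContext context = new FakeContext();
        registry.listeners.add(context.listener); // second path to the listener
        for (int i = 0; i < jobs; i++) {
            context.runJob("job-" + i);
        }
        context = null; // mirrors javaSparkContext = null
        System.gc();    // only a hint; reachable objects are never reclaimed
        int count = 0;
        for (FakeProgressListener l : registry.listeners) {
            count += l.completedJobs.size();
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(recordsStillReachable(1000)); // prints 1000
    }
}
```

In a heap dump, the interesting question is therefore which path from a GC root still leads to the listener, rather than whether the application's own field is null.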
Attachments
Issue Links
- links to