Spark / SPARK-1042

Spark cleans all Java broadcast variables when it hits spark.cleaner.ttl


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.8.0, 0.8.1, 0.9.0
    • Fix Version/s: 0.9.2
    • Component/s: Java API, Spark Core

    Description

      When spark.cleaner.ttl is set, Spark performs the cleanup on time, but it cleans all broadcast variables, not just the ones that are older than the TTL. This causes an exception when the next mapPartitions runs, because the task cannot find the broadcast variable, even when the variable was created immediately before the task ran.

      Our temporary workaround is to leave the TTL unset and live with an ongoing memory leak, which eventually forces a restart.

      We are using JavaSparkContext and our broadcast variables are Java HashMaps.
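
      The difference between the expected and the reported cleanup behavior can be sketched with a small, self-contained model. This is not Spark's implementation: the class TtlCleanupSketch, its methods, and the numeric broadcast ids are hypothetical names chosen for illustration, and the sketch assumes only that each broadcast variable is tracked together with its creation time.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical model of TTL-based cleanup; NOT Spark's actual code.
public class TtlCleanupSketch {

    // Maps a (hypothetical) broadcast id to its creation time in milliseconds.
    private final Map<Long, Long> createdAtMs = new HashMap<>();

    public void register(long broadcastId, long nowMs) {
        createdAtMs.put(broadcastId, nowMs);
    }

    public boolean contains(long broadcastId) {
        return createdAtMs.containsKey(broadcastId);
    }

    // Expected behavior: drop only the entries created more than ttlMs ago.
    public void cleanOlderThan(long nowMs, long ttlMs) {
        long threshold = nowMs - ttlMs;
        Iterator<Map.Entry<Long, Long>> it = createdAtMs.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() < threshold) {
                it.remove();
            }
        }
    }

    // Behavior reported in this issue: every entry is dropped, regardless of age.
    public void cleanAll() {
        createdAtMs.clear();
    }

    public static void main(String[] args) {
        long ttlMs = 60_000L;
        long nowMs = 100_000L;

        TtlCleanupSketch expected = new TtlCleanupSketch();
        expected.register(1L, 10_000L); // older than the TTL
        expected.register(2L, 99_000L); // created just before the cleanup
        expected.cleanOlderThan(nowMs, ttlMs);
        System.out.println("expected: old=" + expected.contains(1L)
                + " fresh=" + expected.contains(2L));

        TtlCleanupSketch reported = new TtlCleanupSketch();
        reported.register(1L, 10_000L);
        reported.register(2L, 99_000L);
        reported.cleanAll();
        System.out.println("reported: old=" + reported.contains(1L)
                + " fresh=" + reported.contains(2L));
    }
}
```

      In this model, a task that looks up broadcast id 2 after cleanAll() fails to find it, which mirrors the missing-broadcast exception described above for a variable created immediately before the task.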

          People

            Assignee: OuyangJin (qqsun8819)
            Reporter: Tal Sliwowicz (sliwo)
            Votes: 0
            Watchers: 5
