Spark / SPARK-22471

SQLListener consumes much memory causing OutOfMemoryError

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.1
    • Component/s: SQL, Web UI
    • Environment: Spark 2.2.0, Linux

    • Important

    Description

      SQLListener may grow very large when Spark runs complex multi-stage requests. The listener tracks metrics for all stages in the _stageIdToStageMetrics hash map. SQLListener does clean up this hash map regularly, but not often enough. Specifically, the method trimExecutionsIfNecessary ensures that _stageIdToStageMetrics holds no metrics for very old data; it runs each time an execution completes.
      However, while an execution with many stages is running, SQLListener keeps adding new entries to _stageIdToStageMetrics without calling trimExecutionsIfNecessary, so the hash map can grow to an enormous size.
      Strictly speaking, this is not a memory leak, because trimExecutionsIfNecessary eventually cleans the hash map. However, the driver program is likely to crash with OutOfMemoryError before that happens (and in practice it does).
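
The growth pattern described above can be illustrated with a toy model. This is a minimal sketch in Python, not Spark's actual SQLListener code: the class ListenerModel, the retained_executions cap, and the helper peak_map_size are hypothetical names introduced for illustration. Only the idea is taken from the description: entries are added per stage, and trimming runs only when an execution completes.

```python
from collections import OrderedDict


class ListenerModel:
    """Toy model of the growth pattern in this report; not Spark's real code."""

    def __init__(self, retained_executions):
        # Hypothetical cap on how many completed executions keep their metrics.
        self.retained_executions = retained_executions
        self.stage_id_to_stage_metrics = {}        # stage id -> metrics placeholder
        self.running_executions = {}               # execution id -> list of stage ids
        self.completed_executions = OrderedDict()  # oldest completed execution first

    def on_stage_submitted(self, execution_id, stage_id):
        # Every stage adds an entry; no trimming happens on this path.
        self.running_executions.setdefault(execution_id, []).append(stage_id)
        self.stage_id_to_stage_metrics[stage_id] = {"execution": execution_id}

    def on_execution_end(self, execution_id):
        # Trimming is triggered only here, when an execution completes.
        stage_ids = self.running_executions.pop(execution_id, [])
        self.completed_executions[execution_id] = stage_ids
        self.trim_executions_if_necessary()

    def trim_executions_if_necessary(self):
        # Drop the oldest completed executions, and their stage metrics,
        # until at most retained_executions completed executions remain.
        while len(self.completed_executions) > self.retained_executions:
            _, stage_ids = self.completed_executions.popitem(last=False)
            for stage_id in stage_ids:
                self.stage_id_to_stage_metrics.pop(stage_id, None)


def peak_map_size(stages_per_execution, executions, retained_executions=1):
    """Return (peak, final) size of the stage metrics map for a simulated run."""
    listener = ListenerModel(retained_executions)
    peak = 0
    next_stage_id = 0
    for execution_id in range(executions):
        for _ in range(stages_per_execution):
            listener.on_stage_submitted(execution_id, next_stage_id)
            next_stage_id += 1
            peak = max(peak, len(listener.stage_id_to_stage_metrics))
        listener.on_execution_end(execution_id)
    return peak, len(listener.stage_id_to_stage_metrics)
```

In this model, peak_map_size(1000, 1, retained_executions=0) returns (1000, 0): the map is empty once the execution ends, yet it held all 1000 entries while the execution was running. The peak is bounded by the number of stages in a single execution, not by the retention cap, which matches the failure mode reported here.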

          People

            Assignee: Arseniy Tashoyan (tashoyan)
            Reporter: Arseniy Tashoyan (tashoyan)
            Votes: 1
            Watchers: 6

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: 72h
              Remaining: 72h
              Logged: Not Specified
