Spark / SPARK-15857

Add Caller Context in Spark


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed

    Description

      Hadoop has implemented a log-tracing feature called caller context (Jira: HDFS-9184 and YARN-4349). The motivation is to better diagnose and understand how specific applications impact parts of the Hadoop system and what problems they may be creating (e.g. overloading the NameNode). As noted in HDFS-9184, for a given HDFS operation it is very helpful to track which upper-level job issued it. The upper-level callers may be specific Oozie tasks, MR jobs, Hive queries, or Spark jobs.

      Hadoop ecosystem projects such as MapReduce, Tez (TEZ-2851), Hive (HIVE-12249, HIVE-12254) and Pig (PIG-4714) have implemented their own caller contexts. These systems invoke the HDFS client API and the YARN client API to set up the caller context, and also expose an API through which their own callers can pass in a caller context.

      Many Spark applications run on YARN/HDFS. Spark can implement its own caller context by invoking the HDFS/YARN APIs, and can also expose an API that lets its upstream applications set up their caller contexts. In the end, the Spark caller context written into the YARN log / HDFS log can be associated with the task id, stage id, job id and app id. This also helps Spark users identify tasks, especially if Spark supports multi-tenant environments in the future.
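
      To make the idea concrete, here is a minimal sketch of how a caller-context string carrying the app id, job id, stage id and task id might be assembled before it is handed to Hadoop. The class name, the field layout, the `SPARK` prefix and the 128-character cap are all assumptions made for illustration; they are not taken from this issue and are not necessarily what Spark emits.

```java
import java.util.StringJoiner;

/** Hypothetical helper for illustration only; not part of Spark or Hadoop. */
final class CallerContextSketch {

    /** Assumed cap on the context length; the real limit is set on the Hadoop side. */
    static final int MAX_LENGTH = 128;

    /**
     * Builds a context string such as SPARK_AppId_<app>_JobId_<job>_StageId_<stage>_TaskId_<task>.
     * Only appId is required; null ids are skipped, so a driver-side call can
     * pass just the application id.
     */
    static String build(String appId, Integer jobId, Integer stageId, Long taskId) {
        if (appId == null || appId.isEmpty()) {
            throw new IllegalArgumentException("appId is required");
        }
        StringJoiner ctx = new StringJoiner("_");
        ctx.add("SPARK").add("AppId").add(appId);
        if (jobId != null) {
            ctx.add("JobId").add(jobId.toString());
        }
        if (stageId != null) {
            ctx.add("StageId").add(stageId.toString());
        }
        if (taskId != null) {
            ctx.add("TaskId").add(taskId.toString());
        }
        String s = ctx.toString();
        // Truncate rather than fail, so an over-long id never blocks the job.
        return s.length() <= MAX_LENGTH ? s : s.substring(0, MAX_LENGTH);
    }

    public static void main(String[] args) {
        String context = build("application_1_0001", 3, 7, 42L);
        System.out.println(context);
        // With a Hadoop release that includes HDFS-9184 on the classpath, the
        // string would then be registered for the current thread, roughly:
        //   org.apache.hadoop.ipc.CallerContext.setCurrent(
        //       new org.apache.hadoop.ipc.CallerContext.Builder(context).build());
    }
}
```

      The Hadoop call is left as a comment so the sketch compiles without Hadoop on the classpath. Skipping absent ids keeps one format usable from both the driver, which knows only the application, and an executor task, which knows all four ids.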

          People

            Assignee: Unassigned
            Reporter: Weiqing Yang
            Votes: 0
            Watchers: 13
