  1. Hive
  2. HIVE-7292 Hive on Spark
  3. HIVE-9561

SHUFFLE_SORT should only be used for order by query [Spark Branch]

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: Spark
    • Labels: None

    Description

      The sortByKey shuffle (SHUFFLE_SORT) launches probe jobs. These jobs can hurt performance and are difficult to control, so we should limit the use of sortByKey to ORDER BY queries only.
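
A minimal sketch of the rule proposed above, for illustration only. The names ShuffleChooser, QueryKind, Shuffle and choose are hypothetical and are not taken from the Hive code base or from the attached patches; the sketch only assumes that a planner can tell whether an edge feeds an ORDER BY, and picks the sorting shuffle in that case alone.

```java
// Hypothetical sketch: none of these names come from Hive itself.
final class ShuffleChooser {

    // Kinds of operations that need a shuffle in this toy model.
    enum QueryKind { ORDER_BY, GROUP_BY, JOIN, DISTRIBUTE_BY }

    // SORT stands in for a sortByKey-style shuffle (total order, probe job);
    // GROUP stands in for a shuffle that only co-locates equal keys.
    enum Shuffle { SORT, GROUP }

    // Only a global ORDER BY needs a total order across partitions,
    // so only that case pays for the sorting shuffle.
    static Shuffle choose(QueryKind kind) {
        return kind == QueryKind.ORDER_BY ? Shuffle.SORT : Shuffle.GROUP;
    }

    public static void main(String[] args) {
        for (QueryKind kind : QueryKind.values()) {
            System.out.println(kind + " -> " + choose(kind));
        }
    }
}
```

In this model every non-ORDER BY edge falls back to the grouping shuffle; how the actual patch handles reducers that still need sorted input within a partition is not covered by the sketch.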

      Attachments

        1. HIVE-9561.1-spark.patch
          3 kB
          Rui Li
        2. HIVE-9561.2-spark.patch
          90 kB
          Rui Li
        3. HIVE-9561.3-spark.patch
          75 kB
          Rui Li
        4. HIVE-9561.4-spark.patch
          65 kB
          Rui Li
        5. HIVE-9561.5-spark.patch
          59 kB
          Xuefu Zhang
        6. HIVE-9561.6-spark.patch
          66 kB
          Xuefu Zhang

          People

            Assignee: Rui Li (lirui)
            Reporter: Rui Li (lirui)
            Votes: 0
            Watchers: 4