IMPALA / IMPALA-4459

Consider making ReportExecStatus() RPC execute asynchronously


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: Impala 2.7.0
    • Fix Version/s: None
    • Component/s: Distributed Exec

    Description

      ReportExecStatus() calls UpdateFragmentExecStatus(), which can take a long time under heavy load because it contends for the QueryExecState lock.

      The only return value that the sender of the RPC cares about can be returned before UpdateFragmentExecStatus() executes. We can therefore make the execution of this function completely asynchronous, without needing a separate RPC to return its result.

      As a result, this change introduces no new distributed failure modes and is relatively easy to make.
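
A minimal sketch of the idea, in Python rather than Impala's C++ and not the actual implementation. The names `Coordinator`, `QueryExecState`, `report_exec_status`, and the `"OK"` / `"UNKNOWN_QUERY"` replies are hypothetical stand-ins. It assumes the reply the sender needs depends only on whether the query is known, so the handler can answer first and hand the lock-contended update to a worker pool:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class QueryExecState:
    """Hypothetical stand-in for per-query state guarded by a single lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.fragment_status = {}

    def update_fragment_exec_status(self, fragment_id, status):
        # Under heavy load this can block for a long time on the lock.
        with self.lock:
            self.fragment_status[fragment_id] = status


class Coordinator:
    """Hypothetical RPC server side that receives status reports."""

    def __init__(self, num_workers=4):
        self.queries = {}  # query_id -> QueryExecState
        self.pool = ThreadPoolExecutor(max_workers=num_workers)

    def report_exec_status(self, query_id, fragment_id, status):
        # Assumption: the only thing the sender needs to know is whether
        # the query is still known, which is decided without the lock.
        state = self.queries.get(query_id)
        if state is None:
            return "UNKNOWN_QUERY"
        # Apply the update in the background; the RPC returns right away,
        # so the sender's connection is not held while the lock is contended.
        self.pool.submit(state.update_fragment_exec_status, fragment_id, status)
        return "OK"
```

With this shape, a call such as `coord.report_exec_status("q1", "f0", "RUNNING")` returns immediately even while another thread holds the `QueryExecState` lock, and the status shows up in `fragment_status` once a worker acquires it. One trade-off of the sketch is that a failure inside the background update is no longer visible to the sender.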

      In private tests on a 16-node cluster with IMPALA-4456 and this change applied, I saw the number of clients created drop from ~1500 to ~500 per node. This change will help in situations where we currently hit the maximum connection limit per node.

          People

            Assignee: Sailesh Mukil
            Reporter: Sailesh Mukil