IMPALA-6285

Avoid printing the stack as part of DoTransmitDataRpc as it leads to burning lots of kernel CPU

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.11.0
    • Fix Version/s: Impala 2.11.0
    • None

    Description

      When running 32 concurrent TPC-DS queries against 20 r4.8xlarge instances, some of the RPCs time out but do not fail the query. Each timeout is logged with a stack trace:

      I1206 12:44:14.925405 25274 status.cc:58] RPC recv timed out: Client foo-17.domain.com:22000 timed-out during recv call.
          @           0x957a6a  impala::Status::Status()
          @          0x11dd5fe  impala::DataStreamSender::Channel::DoTransmitDataRpc()
          @          0x11ddcd4  impala::DataStreamSender::Channel::TransmitDataHelper()
          @          0x11de080  impala::DataStreamSender::Channel::TransmitData()
          @          0x11e1004  impala::ThreadPool<>::WorkerThread()
          @           0xd10063  impala::Thread::SuperviseThread()
          @           0xd107a4  boost::detail::thread_data<>::run()
          @          0x128997a  (unknown)
          @     0x7f68c5bc7e25  start_thread
          @     0x7f68c58f534d  __clone
      
      I1206 12:44:15.152775 25296 status.cc:58] RPC recv timed out: Client foo-5.domain.com:22000 timed-out during recv call.
          @           0x957a6a  impala::Status::Status()
          @          0x11dd5fe  impala::DataStreamSender::Channel::DoTransmitDataRpc()
          @          0x11ddcd4  impala::DataStreamSender::Channel::TransmitDataHelper()
          @          0x11de080  impala::DataStreamSender::Channel::TransmitData()
          @          0x11e1004  impala::ThreadPool<>::WorkerThread()
          @           0xd10063  impala::Thread::SuperviseThread()
          @           0xd107a4  boost::detail::thread_data<>::run()
          @          0x128997a  (unknown)
          @     0x7f68c5bc7e25  start_thread
          @     0x7f68c58f534d  __clone
      

      The status can be changed to an expected status, which avoids printing the stack, but it is worth verifying that this timeout can be tolerated.
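
The proposed change can be sketched roughly as follows. This is a minimal illustration and not Impala's actual Status implementation: SketchStatus, its stack_captures counter, and the RecvTimeoutStatus helper are hypothetical names invented here, and the counter merely stands in for the cost of capturing and logging a stack trace. The idea is that the ordinary constructor pays for the stack capture, while an "expected" factory records only the message.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a status type; NOT Impala's real Status class.
class SketchStatus {
 public:
  // Default path: the error is treated as unexpected, so the expensive
  // stack capture runs (represented here by incrementing a counter).
  explicit SketchStatus(std::string msg) : msg_(std::move(msg)) {
    ++stack_captures;
  }

  // Expected path: keep the message, skip the stack capture entirely.
  static SketchStatus Expected(std::string msg) {
    return SketchStatus(std::move(msg), NoStackTag{});
  }

  const std::string& msg() const { return msg_; }

  // Number of times the expensive path ran (illustration only).
  static int stack_captures;

 private:
  struct NoStackTag {};
  SketchStatus(std::string msg, NoStackTag) : msg_(std::move(msg)) {}

  std::string msg_;
};

int SketchStatus::stack_captures = 0;

// Hypothetical helper: builds the status for a recv timeout on `host`.
// With expected == true, no stack capture is triggered.
SketchStatus RecvTimeoutStatus(const std::string& host, bool expected) {
  std::string msg =
      "RPC recv timed out: Client " + host + " timed-out during recv call.";
  return expected ? SketchStatus::Expected(msg) : SketchStatus(msg);
}
```

With this shape, a caller that knows a recv timeout is tolerable would request the expected variant, so repeated timeouts under load produce a log message without the per-timeout stack trace. Whether the timeout really is tolerable is the open question raised above.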

            People

              Assignee: Michael Ho (kwho)
              Reporter: David Rorke (drorke)
              Votes: 0
              Watchers: 5
