IMPALA-3299

Stress test failures: Couldn't open transport for impala-stress-cdh5-trunk2-5.vpc.cloudera.com:22000


    Description

      The stress test often prints many instances of the following error:

      11:34:00 Process Process-84:
      11:34:00 Traceback (most recent call last):
      11:34:00   File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
      11:34:00     self.run()
      11:34:00   File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
      11:34:00     self._target(*self._args, **self._kwargs)
      11:34:00   File "tests/stress/concurrent_select.py", line 613, in _start_single_runner
      11:34:00     raise Exception("Query failed: %s" % str(report.non_mem_limit_error))
      11:34:00 Exception: Query failed: 
      11:34:00 Couldn't get a client for impala-stress-cdh5-trunk2-5.vpc.cloudera.com:22000	Reason: Couldn't open transport for impala-stress-cdh5-trunk2-5.vpc.cloudera.com:22000 (connect() failed: Connection timed out)
      

      e.g. http://sandbox.jenkins.cloudera.com/job/Impala-Stress-Test-EC2-CDH5-trunk/621/console

      Usually this fails the job, but occasionally the job recovers and keeps going (although the error may show up again).

      It's hard to catch the exact moment this happens, but I've seen 40+ queries running on the impalads after it occurs.

      We need to investigate exactly what is causing this, and then decide what to do about it. This is currently failing a large proportion of stress jobs.
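
As a possible aid to the investigation, here is a minimal sketch of a probe that times a single TCP connect to an impalad port. It assumes the failure is at the TCP connect level, as the "connect() failed: Connection timed out" message suggests. The function name `probe_connect` and the `timeout_s` parameter are hypothetical and are not part of tests/stress/concurrent_select.py.

```python
import socket
import time


def probe_connect(host, port, timeout_s=5.0):
    """Attempt one TCP connect; return (ok, elapsed_seconds, error_string).

    Hypothetical helper for diagnosis only; it opens a plain TCP
    connection and closes it immediately without speaking any protocol.
    """
    start = time.time()
    try:
        sock = socket.create_connection((host, port), timeout=timeout_s)
    except (socket.timeout, socket.error) as e:
        # Covers connect timeouts, refused connections and DNS failures.
        return False, time.time() - start, str(e)
    sock.close()
    return True, time.time() - start, ""
```

Calling this in a loop against each impalad on port 22000 while a stress job runs, and logging the elapsed time next to the number of in-flight queries, might show whether connect latency climbs before the timeouts appear.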

            People

              henryr Henry Robinson
              skye Skye Wanderman-Milne