Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
Impala 2.10.0
-
ghx-label-5
Description
Multiple recent builds hit issues on test_breakpad.py. Two are specific to test_sigusr1_writes_minidump. This may be related to recent changes in IMPALA-5769. One shows issues in other parts of the test.
Here is the dumping from the test_sigusr1_writes_minidump failure:
10:59:35 __________________ TestBreakpad.test_sigusr1_writes_minidump ___________________ 10:59:35 10:59:35 self = <test_breakpad.TestBreakpad object at 0x4ce7bd0> 10:59:35 10:59:35 @pytest.mark.execute_serially 10:59:35 def test_sigusr1_writes_minidump(self): 10:59:35 """Check that when a daemon receives SIGUSR1, it writes a minidump file.""" 10:59:35 assert self.count_all_minidumps() == 0 10:59:35 self.start_cluster() 10:59:35 assert self.count_all_minidumps() == 0 10:59:35 cluster_size = self.get_num_processes('impalad') 10:59:35 self.kill_cluster(SIGUSR1) 10:59:35 # Breakpad forks to write its minidump files, wait for all the clones to terminate. 10:59:35 assert self.wait_for_num_processes('impalad', cluster_size) == cluster_size 10:59:35 assert self.wait_for_num_processes('catalogd', 1) == 1 10:59:35 assert self.wait_for_num_processes('statestored', 1) == 1 10:59:35 # Make sure impalad still answers queries. 10:59:35 client = self.create_impala_client() 10:59:35 self.execute_query_expect_success(client, "SELECT COUNT(*) FROM functional.alltypes") 10:59:35 # Kill the cluster. Sending SIGKILL will not trigger minidumps to be written. 10:59:35 self.kill_cluster(SIGKILL) 10:59:35 > self.assert_num_minidumps_for_all_daemons(cluster_size) 10:59:35 10:59:35 custom_cluster/test_breakpad.py:182: 10:59:35 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 10:59:35 10:59:35 self = <test_breakpad.TestBreakpad object at 0x4ce7bd0>, cluster_size = 3 10:59:35 base_dir = None 10:59:35 10:59:35 def assert_num_minidumps_for_all_daemons(self, cluster_size, base_dir=None): 10:59:35 self.assert_num_logfile_entries(1) 10:59:35 > assert self.count_minidumps('impalad', base_dir) == cluster_size 10:59:35 E assert 2 == 3 10:59:35 E + where 2 = <bound method TestBreakpad.count_minidumps of <test_breakpad.TestBreakpad object at 0x4ce7bd0>>('impalad', None) 10:59:35 E + where <bound method TestBreakpad.count_minidumps of <test_breakpad.TestBreakpad object at 0x4ce7bd0>> = <test_breakpad.TestBreakpad object at 0x4ce7bd0>.count_minidumps 10:59:35 10:59:35 custom_cluster/test_breakpad.py:145: AssertionError
The other failure has this output:
04:46:40 _____________________ TestBreakpad.test_minidump_creation ______________________ 04:46:40 04:46:40 self = <test_breakpad.TestBreakpad object at 0x54399d0> 04:46:40 04:46:40 @pytest.mark.execute_serially 04:46:40 def test_minidump_creation(self): 04:46:40 """Check that when a daemon crashes, it writes a minidump file.""" 04:46:40 assert self.count_all_minidumps() == 0 04:46:40 self.start_cluster() 04:46:40 assert self.count_all_minidumps() == 0 04:46:40 cluster_size = self.get_num_processes('impalad') 04:46:40 self.kill_cluster(SIGSEGV) 04:46:40 > self.assert_num_minidumps_for_all_daemons(cluster_size) 04:46:40 04:46:40 custom_cluster/test_breakpad.py:163: 04:46:40 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 04:46:40 custom_cluster/test_breakpad.py:144: in assert_num_minidumps_for_all_daemons 04:46:40 self.assert_num_logfile_entries(1) 04:46:40 custom_cluster/test_breakpad.py:151: in assert_num_logfile_entries 04:46:40 expected_count=expected_count) 04:46:40 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 04:46:40 04:46:40 self = <test_breakpad.TestBreakpad object at 0x54399d0>, level = 'INFO' 04:46:40 line_regex = 'Wrote minidump to ', expected_count = 1 04:46:40 04:46:40 def assert_impalad_log_contains(self, level, line_regex, expected_count=1): 04:46:40 """ 04:46:40 Assert that impalad log with specified level (e.g. ERROR, WARNING, INFO) contains 04:46:40 expected_count lines with a substring matching the regex. When using this method to 04:46:40 check log files of running processes, the caller should make sure that log buffering 04:46:40 has been disabled, for example by adding '-logbuflevel=-1' to the daemon startup 04:46:40 options. 04:46:40 """ 04:46:40 pattern = re.compile(line_regex) 04:46:40 found = 0 04:46:40 log_file_path = os.path.join(self.impala_log_dir, "impalad." + level) 04:46:40 # Resolve symlinks to make finding the file easier. 04:46:40 log_file_path = os.path.realpath(log_file_path) 04:46:40 with open(log_file_path) as log_file: 04:46:40 for line in log_file: 04:46:40 if pattern.search(line): 04:46:40 found += 1 04:46:40 assert found == expected_count, ("Expected %d lines in file %s matching regex '%s'" 04:46:40 + ", but found %d lines. Last line was: \n%s") % (expected_count, log_file_path, 04:46:40 > line_regex, found, line) 04:46:40 E AssertionError: Expected 1 lines in file /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad.impala-boost-static-burst-slave-105e.vpc.cloudera.com.jenkins.log.INFO.20170817-034950.31592 matching regex 'Wrote minidump to ', but found 2 lines. Last line was: 04:46:40 E Wrote minidump to /tmp/tmpIwVjcx/impalad/2a84fe94-2f54-4835-2e835cae-7c485c64.dmp 04:46:40 04:46:40 common/custom_cluster_test_suite.py:159: AssertionError