Uploaded image for project: 'Kudu'
  1. Kudu
  2. KUDU-3129

ToolTest.TestHmsList can timeout

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 1.12.0
    • None
    • hms, test
    • None

    Description

      When running in TSAN mode, the test timed out, spending 10 minutes not really doing anything. It isn't obvious why, but ToolTest.TestHmsList can timeout, appearing to hang while running the HMS tool.

      I0521 22:31:49.436857  4601 catalog_manager.cc:1161] Initializing in-progress tserver states...
      I0521 22:31:49.446161  4606 hms_notification_log_listener.cc:228] Skipping Hive Metastore notification log poll: Service unavailable: Catalog manager is not initialized. State: Starting
      I0521 22:31:49.839709  4488 heartbeater.cc:325] Connected to a master server at 127.0.89.254:42487
      I0521 22:31:49.845547  4559 master_service.cc:295] Got heartbeat from unknown tserver (permanent_uuid: "cf9e08c4271e4d9aa28b1aacbd630908" instance_seqno: 1590100304311876) as {username='slave'} at 127.0.89.193:33867; Asking this server to re-register.
      I0521 22:31:49.846786  4488 heartbeater.cc:416] Registering TS with master...
      I0521 22:31:49.847297  4488 heartbeater.cc:465] Master 127.0.89.254:42487 requested a full tablet report, sending...
      I0521 22:31:49.849771  4559 ts_manager.cc:191] Registered new tserver with Master: cf9e08c4271e4d9aa28b1aacbd630908 (127.0.89.193:43527)
      I0521 22:31:49.852535   359 external_mini_cluster.cc:699] 1 TS(s) registered with all masters
      W0521 22:32:23.142868  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b060 after lost signal to thread 4531
      W0521 22:32:23.143333  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b780 after lost signal to thread 4591
      W0521 22:32:28.996440  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b740 after lost signal to thread 4531
      W0521 22:32:28.996966  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b520 after lost signal to thread 4591
      W0521 22:33:05.743249  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002aae0 after lost signal to thread 4360
      W0521 22:33:05.743983  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002af00 after lost signal to thread 4486
      I0521 22:33:49.594769  4549 maintenance_manager.cc:326] P c3cc85c33a5447b2aa520019fe162966: Scheduling FlushMRSOp(00000000000000000000000000000000): perf score=0.033386
      I0521 22:33:49.637208  4548 maintenance_manager.cc:525] P c3cc85c33a5447b2aa520019fe162966: FlushMRSOp(00000000000000000000000000000000) complete. Timing: real 0.042s	user 0.032s	sys 0.008s Metrics: {"bytes_written":6485,"cfile_init":1,"dirs.queue_time_us":675,"dirs.run_cpu_time_us":237,"dirs.run_wall_time_us":997,"drs_written":1,"lbm_read_time_us":231,"lbm_reads_lt_1ms":4,"lbm_write_time_us":1980,"lbm_writes_lt_1ms":27,"rows_written":5,"thread_start_us":953,"threads_started":2,"wal-append.queue_time_us":819}
      I0521 22:33:49.639096  4549 maintenance_manager.cc:326] P c3cc85c33a5447b2aa520019fe162966: Scheduling UndoDeltaBlockGCOp(00000000000000000000000000000000): 396 bytes on disk
      I0521 22:33:49.640486  4548 maintenance_manager.cc:525] P c3cc85c33a5447b2aa520019fe162966: UndoDeltaBlockGCOp(00000000000000000000000000000000) complete. Timing: real 0.001s	user 0.001s	sys 0.000s Metrics: {"cfile_init":1,"lbm_read_time_us":269,"lbm_reads_lt_1ms":4}
      W0521 22:34:17.794472  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002ade0 after lost signal to thread 4360
      W0521 22:34:17.795437  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002a7e0 after lost signal to thread 4486
      W0521 22:34:20.286921  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b2e0 after lost signal to thread 4531
      W0521 22:34:20.287376  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b140 after lost signal to thread 4591
      W0521 22:35:27.726336  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002af40 after lost signal to thread 4360
      W0521 22:35:27.727084  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002a980 after lost signal to thread 4486
      W0521 22:36:12.250830  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b9c0 after lost signal to thread 4531
      W0521 22:36:12.251247  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b220 after lost signal to thread 4591
      W0521 22:36:34.702325  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b800 after lost signal to thread 4531
      W0521 22:36:34.702808  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b700 after lost signal to thread 4591
      W0521 22:37:03.237411  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002ab20 after lost signal to thread 4360
      W0521 22:37:03.238131  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002a8e0 after lost signal to thread 4486
      W0521 22:37:27.446748  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b320 after lost signal to thread 4531
      W0521 22:37:27.447190  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080008a460 after lost signal to thread 4591
      W0521 22:38:26.852612  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b7c0 after lost signal to thread 4531
      W0521 22:38:26.853031  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b8c0 after lost signal to thread 4591
      W0521 22:38:28.039356  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002a9a0 after lost signal to thread 4360
      W0521 22:38:28.040096  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002ad80 after lost signal to thread 4486
      W0521 22:38:45.889757  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002aa60 after lost signal to thread 4360
      W0521 22:38:45.890512  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002a0a0 after lost signal to thread 4486
      W0521 22:39:57.414119  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002ae60 after lost signal to thread 4360
      W0521 22:39:57.414861  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002a140 after lost signal to thread 4486
      W0521 22:40:04.007081  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b240 after lost signal to thread 4531
      W0521 22:40:04.007515  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b8a0 after lost signal to thread 4591
      W0521 22:40:41.883837  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002a880 after lost signal to thread 4360
      W0521 22:40:41.884637  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002ace0 after lost signal to thread 4486
      W0521 22:41:06.952173  4380 debug-util.cc:402] Leaking SignalData structure 0x7b0800058020 after lost signal to thread 4360
      W0521 22:41:06.952941  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002a6e0 after lost signal to thread 4486
      W0521 22:41:36.942020  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b560 after lost signal to thread 4531
      W0521 22:41:36.942495  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b4a0 after lost signal to thread 4591
      W0521 22:42:02.273381  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002bb20 after lost signal to thread 4531
      W0521 22:42:02.273813  4545 debug-util.cc:402] Leaking SignalData structure 0x7b080002b0e0 after lost signal to thread 4591
      W0521 22:42:37.898852  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002acc0 after lost signal to thread 4360
      W0521 22:42:37.899608  4380 debug-util.cc:402] Leaking SignalData structure 0x7b080002aca0 after lost signal to thread 4486
      ************************ BEGIN STACKS **************************
      [New LWP 4501]
      [New LWP 4500]
      ...
      Thread 1 (Thread 0x7ff8a61a2400 (LWP 359)):
      #0  0x00007ff892b40693 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
      #1  0x00000000004db89e in __interceptor_epoll_wait (epfd=31, ev=0x7b580000c900, cnt=64, timeout=59743) at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/llvm-9.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1829
      #2  0x00007ff89571275d in epoll_poll (loop=0x7b64000d7a00, timeout=59.743000000000002) at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/libev-4.20/ev_epoll.c:153
      #3  0x00007ff8957068c2 in ev_run (loop=0x7b64000d7a00, flags=0) at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/libev-4.20/ev.c:3521
      #4  0x00007ff898d8f539 in ev::loop_ref::run (this=0x7ffcdcd2ca40, flags=51456) at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/tsan/include/ev++.h:211
      #5  0x00007ff896d57b1d in kudu::(anonymous namespace)::ReadFdsFully (progname=..., fds=..., out=0x7ffcdcd2cb70) at /home/jenkins-slave/workspace/kudu-master/0/src/kudu/util/subprocess.cc:242
      #6  0x00007ff896d5754a in kudu::Subprocess::Call (argv=..., stdin_in=..., stdout_out=<optimized out>, stderr_out=0x7ffcdcd2cd78, env_vars=...) at /home/jenkins-slave/workspace/kudu-master/0/src/kudu/util/subprocess.cc:771
      #7  0x00007ff8a5781532 in kudu::tools::RunKuduTool (args=..., out=0x7ffcdcd2cd60, err=0x7ffcdcd2cd78, in=...) at /home/jenkins-slave/workspace/kudu-master/0/src/kudu/tools/tool_test_util.cc:77
      #8  0x00000000005f0905 in kudu::tools::ToolTest::RunTool (this=<optimized out>, arg_str=..., stdout=0x7ffcdcd2d580, stderr=0x7ffcdcd2cdf8, stdout_lines=0x0, stderr_lines=0x0, in=...) at /home/jenkins-slave/workspace/kudu-master/0/src/kudu/tools/kudu-tool-test.cc:246
      #9  0x00000000005f52ea in kudu::tools::ToolTest::RunActionStdoutString (this=0x7b2000001080, arg_str=..., stdout=0x7ffcdcd2d580) at /home/jenkins-slave/workspace/kudu-master/0/src/kudu/tools/kudu-tool-test.cc:283
      #10 0x00000000005cb481 in kudu::tools::ToolTest_TestHmsList_Test::TestBody (this=0x7b2000001080) at /home/jenkins-slave/workspace/kudu-master/0/src/kudu/tools/kudu-tool-test.cc:4588
      #11 0x00007ff8a3d32fa0 in HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x7b2000001080, method=<optimized out>, location=0x7ff8a3d50c8a "the test body") at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2402
      #12 testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=<optimized out>, method=<optimized out>, location=0x7ff8a3d50c8a "the test body") at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2438
      #13 0x00007ff8a3d11f39 in testing::Test::Run (this=0x7b2000001080) at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2474
      #14 0x00007ff8a3d131dd in testing::TestInfo::Run (this=0x7b3c00001d10) at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2656
      #15 0x00007ff8a3d13ef7 in testing::TestCase::Run (this=<optimized out>) at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2774
      #16 0x00007ff8a3d203a7 in testing::internal::UnitTestImpl::RunAllTests (this=<optimized out>) at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:4649
      #17 0x00007ff8a3d33e50 in HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x7b5000000200, method=<optimized out>, location=0x7ff8a3d5138b "auxiliary test code (environments or event listeners)") at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2402
      #18 testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=<optimized out>, method=<optimized out>, location=0x7ff8a3d5138b "auxiliary test code (environments or event listeners)") at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:2438
      #19 0x00007ff8a3d1fc62 in testing::UnitTest::Run (this=<optimized out>) at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/src/googletest-release-1.8.0/googletest/src/gtest.cc:4257
      #20 0x00007ff8a45b570c in RUN_ALL_TESTS () at /home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/tsan/include/gtest/gtest.h:2233
      #21 0x00007ff8a45b4587 in main (argc=1, argv=0x7ffcdcd2daf8) at /home/jenkins-slave/workspace/kudu-master/0/src/kudu/util/test_main.cc:106
      #22 0x00007ff892a63f45 in __libc_start_main (main=0x4c1470 <main@plt>, argc=3, argv=0x7ffcdcd2dae8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffcdcd2dad8) at libc-start.c:287
      #23 0x00000000004c8ea8 in _start ()
      

      Attachments

        1. kudu-tool-test.2.txt
          667 kB
          Andrew Wong
        2. kudu-tool-test.2.txt.xz
          9 kB
          Alexey Serbin

        Activity

          People

            Unassigned Unassigned
            awong Andrew Wong
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: