Hadoop HDFS / HDFS-16934

org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression

Details

    • Reviewed

    Description

      Jenkins test failure: the logged output is in a different order from the one the assertions expect. HDFS-16624 flipped the order; without that change the test would have passed.

      
      java.lang.AssertionError
      	at org.junit.Assert.fail(Assert.java:87)
      	at org.junit.Assert.assertTrue(Assert.java:42)
      	at org.junit.Assert.assertTrue(Assert.java:53)
      	at org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1149)
      

      Here the test asserts on the contents of the output at fixed positions:

          assertTrue(outs.get(0).startsWith("Reconfiguring status for node"));
          assertTrue("SUCCESS: Changed property dfs.datanode.peer.stats.enabled".equals(outs.get(2))
              || "SUCCESS: Changed property dfs.datanode.peer.stats.enabled".equals(outs.get(1)));  // here
          assertTrue("\tFrom: \"false\"".equals(outs.get(3)) || "\tFrom: \"false\"".equals(outs.get(2)));
          assertTrue("\tTo: \"true\"".equals(outs.get(4)) || "\tTo: \"true\"".equals(outs.get(3)));
      

      If you look at the log, the expected line does appear in the output, just at a different position: the status output for the two datanodes is interleaved, which points to a race condition.

      2023-02-24 01:02:06,275 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:testAllDatanodesReconfig(1146)) - dfsadmin -status -livenodes output:
      2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring status for node [127.0.0.1:41795]: started at Fri Feb 24 01:02:03 GMT 2023 and finished at Fri Feb 24 01:02:03 GMT 2023.
      2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring status for node [127.0.0.1:34007]: started at Fri Feb 24 01:02:03 GMT 2023SUCCESS: Changed property dfs.datanode.peer.stats.enabled
      2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - 	From: "false"
      2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - 	To: "true"
      2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) -  and finished at Fri Feb 24 01:02:03 GMT 2023.
      2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO  tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - SUCCESS: Changed property dfs.datanode.peer.stats.enabled
      

      We have a race condition in output generation, and the assertions are clearly too brittle.

      For the 3.3.5 release I'm not going to make this a blocker. What I will do is propose that the asserts move to AssertJ, with an assertion that the collection "containsExactlyInAnyOrder" all the strings.

      That will
      1. not be brittle.
      2. give nice errors on failure
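
      A minimal plain-Java sketch of the order-insensitive idea, replaying the output lines from the log above. The class OrderInsensitiveOutputCheck and its missingLines helper are hypothetical and not part of TestDFSAdmin; with AssertJ on the classpath the same intent would be expressed through its iterable assertions (for example contains or containsExactlyInAnyOrder) rather than a hand-written helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop or TestDFSAdmin.
class OrderInsensitiveOutputCheck {

  /** Returns the expected lines that are absent from the actual output, ignoring position. */
  static List<String> missingLines(List<String> actual, String... expected) {
    List<String> remaining = new ArrayList<>(actual);
    List<String> missing = new ArrayList<>();
    for (String line : expected) {
      // remove(Object) consumes one occurrence, so repeated expectations are counted.
      if (!remaining.remove(line)) {
        missing.add(line);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    String success = "SUCCESS: Changed property dfs.datanode.peer.stats.enabled";
    String from = "\tFrom: \"false\"";
    String to = "\tTo: \"true\"";

    // Output lines as they appear in the failing run's log, in the same order.
    List<String> outs = Arrays.asList(
        "Reconfiguring status for node [127.0.0.1:41795]: started at Fri Feb 24 01:02:03 GMT 2023"
            + " and finished at Fri Feb 24 01:02:03 GMT 2023.",
        "Reconfiguring status for node [127.0.0.1:34007]: started at Fri Feb 24 01:02:03 GMT 2023"
            + success,
        from,
        to,
        " and finished at Fri Feb 24 01:02:03 GMT 2023.",
        success);

    // The position-based check from the test: fails because SUCCESS is at index 5.
    boolean positional = success.equals(outs.get(2)) || success.equals(outs.get(1));
    System.out.println("positional check passes: " + positional);

    // The order-insensitive check: passes, and on failure names the missing lines.
    List<String> missing = missingLines(outs, success, from, to);
    System.out.println("missing lines: " + missing);
  }
}
```

      On this input the positional check evaluates to false while the list of missing lines is empty, which matches the failure seen on Jenkins. Returning the missing lines instead of a boolean is what gives a readable message when the check does fail.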

            People

              slfan1989 Shilun Fan
              stevel@apache.org Steve Loughran