Uploaded image for project: 'Accumulo'
  1. Accumulo
  2. ACCUMULO-4365

ShellServerIT#trace() failing intermittently due to missing "sendMutations" block

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 1.7.4, 1.9.0
    • test
    • None

    Description

      Noticed this on master, but not sure if it also affects other branches.

      trace(org.apache.accumulo.test.ShellServerIT)  Time elapsed: 5.166 sec  <<< FAILURE!
      java.lang.AssertionError
      	at org.apache.accumulo.test.ShellServerIT.trace(ShellServerIT.java:1630)
      

      This is a trace that was observed when the test case failed.

      Trace started at 2016/07/10 22:43:38.277
      Time  Start  Service@Location       Name
       3446+0      shell@localhost shell:root
          1+160      master@0.0.0.0 beginFateOperation
          4+167      master@0.0.0.0 executeFateOperation
          3+173        master@0.0.0.0 CreateTable
          2+176        master@0.0.0.0 CreateTable
         16+181          master@0.0.0.0 SetupPermissions
          4+200            master@0.0.0.0 PopulateZookeeper
         19+204            master@0.0.0.0 PopulateZookeeper
          1+694              master@0.0.0.0 ChooseDir
          1+709                master@0.0.0.0 CreateDir
          2+712                  master@0.0.0.0 PopulateMetadata
          1+713                    tserver@0.0.0.0 update
          1+713                      tserver@0.0.0.0 prep
          5+716                    master@0.0.0.0 FinishCreateTable
        563+172      master@0.0.0.0 waitForFateOperation
          2+736      master@0.0.0.0 finishFateOperation
       1513+745      shell@localhost close
         13+746        shell@localhost BinMutations 1
          5+746          shell@localhost binMutations
          2+748            tserver@0.0.0.0 startScan
          1+748              tserver@0.0.0.0 metadata tablets read ahead 5
          3+2259     tserver@0.0.0.0 getTableConfiguration
          3+2263     tserver@0.0.0.0 getTableConfiguration
          3+2267     tserver@0.0.0.0 getTableConfiguration
          3+2270     tserver@0.0.0.0 getTableConfiguration
          3+2281     shell@localhost scan
          2+2282       shell@localhost scan:location
          2+2282         tserver@0.0.0.0 startScan
          2+2282           tserver@0.0.0.0 tablet read ahead 6
          7+2285     master@0.0.0.0 beginFateOperation
          2+2293     master@0.0.0.0 executeFateOperation
          3+2297       master@0.0.0.0 DeleteTable
          1+2300       master@0.0.0.0 DeleteTable
          4+2413         master@0.0.0.0 CleanUp
          2+2415           master@0.0.0.0 scan
          1+2415             master@0.0.0.0 scan:location
          1+2415               tserver@0.0.0.0 startScan
          1+2415                 tserver@0.0.0.0 metadata tablets read ahead 6
         20+2417         master@0.0.0.0 CleanUp
          2+2417           master@0.0.0.0 batch scanner 555- 1
          1+2417             master@0.0.0.0 client:startMultiScan
          1+2418             tserver@0.0.0.0 startMultiScan
          1+2418               tserver@0.0.0.0 metadata tablets read ahead 7
          1+2420           master@0.0.0.0 scan
          1+2420             master@0.0.0.0 scan:location
          1+2420               tserver@0.0.0.0 startScan
          1+2420                 tserver@0.0.0.0 metadata tablets read ahead 1
          2+2421           master@0.0.0.0 close
          1+2421             master@0.0.0.0 BinMutations 1
          1+2421               master@0.0.0.0 binMutations
          1+2423           master@0.0.0.0 scan
          1+2423             master@0.0.0.0 scan:location
          1+2423               tserver@0.0.0.0 startScan
          1+2423                 tserver@0.0.0.0 metadata tablets read ahead 8
        145+2296     master@0.0.0.0 waitForFateOperation
          1+2441     master@0.0.0.0 finishFateOperation
      

      In another run where the test did not fail:

      Trace started at 2016/07/10 22:48:06.432
      Time  Start  Service@Location       Name
       3066+0      shell@localhost shell:root
          5+210      master@0.0.0.0 beginFateOperation
          4+222      master@0.0.0.0 executeFateOperation
          2+228        master@0.0.0.0 CreateTable
          2+230        master@0.0.0.0 CreateTable
         15+235          master@0.0.0.0 SetupPermissions
          1+252            master@0.0.0.0 PopulateZookeeper
         10+253            master@0.0.0.0 PopulateZookeeper
          2+266              master@0.0.0.0 ChooseDir
         70+227      master@0.0.0.0 waitForFateOperation
          2+298      master@0.0.0.0 finishFateOperation
       1511+306      shell@localhost close
          9+306        shell@localhost BinMutations 1
          5+306          shell@localhost binMutations
          2+308            tserver@0.0.0.0 startScan
          1+308              tserver@0.0.0.0 metadata tablets read ahead 5
          6+1818     tserver@0.0.0.0 getTableConfiguration
          3+1825     tserver@0.0.0.0 getTableConfiguration
          4+1828     tserver@0.0.0.0 getTableConfiguration
          3+1833     tserver@0.0.0.0 getTableConfiguration
          1+1836     shell@localhost client:getUserAuthorizations
          3+1845     shell@localhost scan
          2+1846       shell@localhost scan:location
          2+1846         tserver@0.0.0.0 startScan
          1+1847           tserver@0.0.0.0 tablet read ahead 8
          7+1849     master@0.0.0.0 beginFateOperation
          3+1856     master@0.0.0.0 executeFateOperation
          2+1860       master@0.0.0.0 DeleteTable
          1+1862       master@0.0.0.0 DeleteTable
          5+2027         master@0.0.0.0 CleanUp
          4+2028           master@0.0.0.0 scan
          4+2028             master@0.0.0.0 scan:location
          2+2029               tserver@0.0.0.0 startScan
          1+2030                 tserver@0.0.0.0 metadata tablets read ahead 4
         24+2032         master@0.0.0.0 CleanUp
          3+2032           master@0.0.0.0 batch scanner 560- 1
          1+2032             master@0.0.0.0 client:startMultiScan
          1+2033             tserver@0.0.0.0 startMultiScan
          1+2033               tserver@0.0.0.0 metadata tablets read ahead 5
          2+2035           master@0.0.0.0 scan
          2+2035             master@0.0.0.0 scan:location
          1+2036               tserver@0.0.0.0 startScan
          1+2036                 tserver@0.0.0.0 metadata tablets read ahead 6
          2+2037           master@0.0.0.0 close
          2+2039           master@0.0.0.0 scan
          2+2039             master@0.0.0.0 scan:location
          1+2040               tserver@0.0.0.0 startScan
          1+2040                 tserver@0.0.0.0 metadata tablets read ahead 7
        200+1859     master@0.0.0.0 waitForFateOperation
          1+2060     master@0.0.0.0 finishFateOperation
      The following spans are not rooted (probably due to a parent span of length 0ms):
          2+273    master@0.0.0.0 PopulateMetadata
          1+274    tserver@0.0.0.0 update
          1+274    tserver@0.0.0.0 wal
          5+278    master@0.0.0.0 FinishCreateTable
          1+2038   master@0.0.0.0 sendMutations
          1+2038   tserver@0.0.0.0 update
          1+2038   master@0.0.0.0 org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter 1
          1+2038   tserver@0.0.0.0 prep
      

      Note how the only sendMutations is actually coming from the Master (and is unrooted for some reason...), not actually from the BatchWriter as we'd expected.

      ShawnWalker, maybe this is related to your changes in ACCUMULO-4191? Do you have any time to look into this?

      Attachments

        Issue Links

          Activity

            People

              milleruntime Michael Miller
              elserj Josh Elser
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 5h 50m
                  5h 50m