  1. Cassandra
  2. CASSANDRA-10156

Creating Materialized views concurrently leads to missing data

Details

    • Normal

    Description

      nutbunnies was writing dtests that create multiple tables concurrently. He also wrote a test that creates multiple MVs but has not been able to get it to work properly. After some debugging outside of dtest, it seems there is an issue when more than one MV is created at the same time. There are no errors in the log, but the MVs are never fully populated and are missing data.

      I've attached 2 scripts:

      mv_test_bad.sh: reproduces the issue. It creates 4 MVs at the same time. At the end, some data is missing from the MVs and there is nothing in system.hints or system.batchlog.

      mv_test_good.sh: the same script, except that it waits 10 seconds between each MV creation, which results in 4 MVs with all the data.
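
      For readers without the attachments, the difference between the two scripts can be sketched roughly as below. This is not the content of mv_test_bad.sh or mv_test_good.sh; the function names, the keyspace/table/column names and the `execute` callable are all hypothetical. `execute` stands in for whatever runs a CQL statement (for example a driver session's execute method or a wrapper around cqlsh).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def mv_statements(keyspace, table, key, columns):
    """Build one CREATE MATERIALIZED VIEW statement per column (hypothetical naming scheme)."""
    return [
        f"CREATE MATERIALIZED VIEW {keyspace}.{table}_by_{col} AS "
        f"SELECT * FROM {keyspace}.{table} "
        f"WHERE {col} IS NOT NULL AND {key} IS NOT NULL "
        f"PRIMARY KEY ({col}, {key})"
        for col in columns
    ]


def create_concurrently(execute, statements):
    """Submit all statements at once, the pattern described for mv_test_bad.sh."""
    with ThreadPoolExecutor(max_workers=len(statements)) as pool:
        # list() forces every call to finish and re-raises any exception.
        list(pool.map(execute, statements))


def create_staggered(execute, statements, delay_seconds=10, sleep=time.sleep):
    """Run statements one at a time with a pause in between, as described for mv_test_good.sh."""
    for i, stmt in enumerate(statements):
        if i:
            sleep(delay_seconds)
        execute(stmt)
```

      With four columns this yields four views; per the report, only the concurrent variant ends up with missing data.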

      Some more notes from Andrew:

      - inserting fewer than ~1000 rows does not exhibit the inconsistent behavior
      - adding more columns/MVs makes it worse -- more of the MV counts are consistently wrong
      - the amount of disagreement varies between runs -- usually one of the MVs is correct though
      - the describe cluster and system.mv* queries always "look" good
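
      The "wrong counts" in the notes above could be checked with something like the following sketch. It assumes every base row has non-null values for each view's key columns, so that a fully built MV has the same row count as the base table. `count_rows` is a hypothetical callable that returns the row count for a table or view name (for example by running SELECT COUNT(*) against it); it is not part of the attached scripts.

```python
def find_incomplete_views(count_rows, base_table, views):
    """Return {view: (view_count, base_count)} for every view whose count differs from the base table."""
    expected = count_rows(base_table)
    mismatches = {}
    for view in views:
        actual = count_rows(view)
        if actual != expected:
            mismatches[view] = (actual, expected)
    return mismatches
```

      An empty result means all view counts match the base table; any entry identifies a view that is missing (or has extra) rows.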
      

      Thanks Andrew for finding this bug and providing the test scripts!

      //cc carlyeks tjake enigmacurry

      Attachments

        1. mv_test_good.sh
          3 kB
          Alan Boudreault
        2. mv_test_bad.sh
          2 kB
          Alan Boudreault

          People

            tjake T Jake Luciani
            aboudreault Alan Boudreault
            T Jake Luciani
            Carl Yeksigian
            Alan Boudreault Alan Boudreault