
Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hdfs-client
    • Labels: None

    Description

      Occasionally "Version Mismatch (Expected: 28, Received: 22794 )" shows up in the logs. It rarely happens with fewer than 500 concurrent reads, but it becomes frequent enough to be a problem at 1000 concurrent reads.

      I've seen 3 distinct numbers: 23050 (most common), 22538, and 22794. If you break these shorts into their two bytes (high byte first) you get:

      23050 -> [90,10]
      22794 -> [89,10]
      22538 -> [88,10]
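
The split above can be reproduced with two bit operations. This is a minimal sketch that assumes the received value is an unsigned 16-bit integer read in network (big-endian) byte order; `split_short` is a hypothetical helper name, not something from libhdfs++.

```python
def split_short(value):
    """Split an unsigned 16-bit value into (high_byte, low_byte)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    return value >> 8, value & 0xFF


# The three mismatched versions reported in the logs.
observed = [23050, 22794, 22538]
decoded = {v: split_short(v) for v in observed}
for value, (high, low) in decoded.items():
    print(f"{value} -> [{high},{low}]")
```

All three values share the low byte 10 and differ only in the high byte, which is what makes the pattern stand out.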
      

      Interestingly enough, if we dump the buffers holding protobuf messages just before they hit the wire, we see things like the following, where the first two bytes are 90,10:

      buffer ={90,10,82,10,64,10,52,10,37,66,80,45,49,51,56,49,48,51,51,57,57,49,45,49,50,55,46,48,46,48,46,49,45,49,52,53,57,53,50,53,54,49,53,55,50,53,16,-127,-128,-128,-128,4,24,-23,7,32,-128,-128,64,18,8,10,0,18,0,26,0,34,0,18,14,108,105,98,104,100,102,115,43,43,95,75,67,43,49,16,0,24,23,32,1}
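
To check how the dumped buffer relates to the mismatched values, the sketch below reinterprets its first two bytes as a big-endian short and compares the first byte with the number of bytes that follow it. The byte values are copied from the dump above; the helper names are hypothetical, and reading the first byte as a protobuf length prefix is an interpretation of this one dump, not something the report states.

```python
import struct

# Signed byte values copied verbatim from the dumped buffer.
DUMPED = [
    90, 10, 82, 10, 64, 10, 52, 10, 37, 66,
    80, 45, 49, 51, 56, 49, 48, 51, 51, 57,
    57, 49, 45, 49, 50, 55, 46, 48, 46, 48,
    46, 49, 45, 49, 52, 53, 57, 53, 50, 53,
    54, 49, 53, 55, 50, 53, 16, -127, -128, -128,
    -128, 4, 24, -23, 7, 32, -128, -128, 64, 18,
    8, 10, 0, 18, 0, 26, 0, 34, 0, 18,
    14, 108, 105, 98, 104, 100, 102, 115, 43, 43,
    95, 75, 67, 43, 49, 16, 0, 24, 23, 32,
    1,
]


def to_bytes(signed_values):
    """Convert signed (-128..127) byte values to a bytes object."""
    return bytes(v & 0xFF for v in signed_values)


def leading_short(raw):
    """Interpret the first two bytes as an unsigned big-endian short."""
    return struct.unpack(">H", raw[:2])[0]


raw = to_bytes(DUMPED)
first_short = leading_short(raw)
# Does the first byte equal the count of the bytes that follow it?
prefix_matches_length = raw[0] == len(raw) - 1
print(first_short, prefix_matches_length)
```

In this dump the first byte (90) equals the number of remaining bytes, which is consistent with a one-byte varint length prefix on a delimited protobuf message, and 10 is the protobuf tag for field 1 with a length-delimited wire type. If that reading holds, 88 and 89 could simply be the lengths of slightly shorter requests.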
      

      The first 3 bytes the DN (DataNode) expects for an unsecured read block request are:

      {0,28,81} //[0, 28]->a short for protocol, 81 is read block opcode
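
A small simulation shows why a missing header would produce exactly this log line. It is a sketch under stated assumptions: the version (28) and opcode (81) come from the report, while `build_header` and `check_version` are hypothetical stand-ins and not the DataNode's actual code.

```python
import io
import struct

EXPECTED_VERSION = 28   # protocol version from the report
READ_BLOCK_OPCODE = 81  # read block opcode from the report


def build_header():
    """Pack the 3-byte header: big-endian short version, then opcode byte."""
    return struct.pack(">HB", EXPECTED_VERSION, READ_BLOCK_OPCODE)


def check_version(stream):
    """Read the leading short as a receiver might and compare it."""
    (received,) = struct.unpack(">H", stream.read(2))
    if received != EXPECTED_VERSION:
        return (
            f"Version Mismatch (Expected: {EXPECTED_VERSION}, "
            f"Received: {received} )"
        )
    return "ok"


# First bytes of the dumped protobuf message from the report.
payload_start = bytes([90, 10, 82, 10])

with_header = check_version(io.BytesIO(build_header() + payload_start))
without_header = check_version(io.BytesIO(payload_start))
print(with_header)
print(without_header)
```

When the header is present the check passes; when the stream starts directly with the protobuf bytes, the receiver reads 90,10 as the version and reports 23050.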
      

      It seems that either connections are getting swapped between readers, or the header isn't being sent for some reason while the protobuf message is.

      I've ruled out memory stomps on the header data (see HDFS-10241) by sticking the 3-byte header in its own static buffer that all requests use.

      Some notes:
      - The mismatched number stays the same for the duration of a stress test.
      - The mismatches are distributed fairly evenly throughout the logs.

      Attachments

        1. HDFS-10247.HDFS-8707.002.patch
          11 kB
          James Clampffer
        2. HDFS-10247.HDFS-8707.001.patch
          8 kB
          James Clampffer
        3. HDFS-10247.HDFS-8707.000.patch
          4 kB
          James Clampffer

          People

            Assignee: James Clampffer
            Reporter: James Clampffer