  1. IMPALA
  2. IMPALA-2516

Regex in __parse_duration_ms looks fragile

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • Impala 2.2.4
    • None
    • None

    Description

      Hi Alan,

      It looks like a recent change of yours introduced a Python test (test_hash_join_timer.py)
      that parses a run duration string. However, the regular expression it uses doesn't work for certain inputs:

        def __parse_duration_ms(self, duration):
          """Parses a duration string of the form 1h2h3m4s5.6ms into milliseconds."""
          matches = re.findall(r'([0-9]+h)?([0-9]+m)?([0-9]+s)?([0-9]+(\.[0-9]+)?ms)?'
                               r'([0-9]+(\.[0-9]+)?us)?([0-9]+(\.[0-9]+)?ns)?',
                               duration)
          # Expect exactly two matches because all groups are optional in the regex.
          if matches is None or len(matches) != 2:
            assert False, 'Failed to parse duration string %s' % duration
      
      >>> x = re.compile(r'([0-9]+h)?([0-9]+m)?([0-9]+s)?([0-9]+(\.[0-9]+)?ms)?([0-9]+(\.[0-9]+)?us)?([0-9]+(\.[0-9]+)?ns)?')
      >>> re.findall(x, '5ms')
      [('', '5m', '', '', '', '', '', '', ''), ('', '', '', '', '', '', '', '', ''), ('', '', '', '', '', '', '', '', '')]
      >>> y = re.findall(x, '5ms')
      >>> len(y)
      3

      >>> y = re.findall(x, '5.6ms')
      >>> len(y)
      2
      >>> y
      [('', '', '', '5.6ms', '.6', '', '', '', ''), ('', '', '', '', '', '', '', '', '')]
      

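      The code is confusing minutes ('m') with milliseconds ('ms'): the minutes group ([0-9]+m)?
      matches the '5m' prefix of '5ms', leaving a stray 's', so findall returns three matches
      instead of the expected two. '5.6ms' parses correctly only because no group before the
      ms group accepts a '.'.
      This is causing intermittent gvm failures: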

      http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge/1213/
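
One possible way to make the parse unambiguous, as a rough sketch rather than the actual fix: anchor the pattern to the whole string and add a negative lookahead so the minutes group cannot consume the 'm' of 'ms'. The names parse_duration_ms, _DURATION_RE and _MS_PER_UNIT are hypothetical, and raising ValueError in place of the test's assert is an assumption made for illustration.

```python
import re

# Hypothetical sketch, not the code in test_hash_join_timer.py.
# 'm(?!s)' keeps the minutes group from swallowing the 'm' of 'ms';
# '\Z' forces the pattern to cover the whole input string.
_DURATION_RE = re.compile(
    r'(?:(?P<h>[0-9]+)h)?'
    r'(?:(?P<m>[0-9]+)m(?!s))?'
    r'(?:(?P<s>[0-9]+)s)?'
    r'(?:(?P<ms>[0-9]+(?:\.[0-9]+)?)ms)?'
    r'(?:(?P<us>[0-9]+(?:\.[0-9]+)?)us)?'
    r'(?:(?P<ns>[0-9]+(?:\.[0-9]+)?)ns)?\Z')

# Milliseconds per unit, keyed by the named groups above.
_MS_PER_UNIT = {'h': 3600000.0, 'm': 60000.0, 's': 1000.0,
                'ms': 1.0, 'us': 1e-3, 'ns': 1e-6}


def parse_duration_ms(duration):
    """Parses a duration string such as 1h2m3s4.5ms into milliseconds."""
    match = _DURATION_RE.match(duration)
    # Every group is optional, so also reject inputs where nothing matched
    # (for example the empty string).
    if match is None or not any(match.groupdict().values()):
        raise ValueError('Failed to parse duration string %s' % duration)
    return sum(float(value) * _MS_PER_UNIT[unit]
               for unit, value in match.groupdict().items()
               if value is not None)
```

With this sketch, parse_duration_ms('5ms') and parse_duration_ms('5.6ms') both go through the ms group, while parse_duration_ms('5m') is still read as five minutes. Using a single anchored match also removes the need to count findall results.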

          People

            Assignee: Alan Choi (alan@cloudera.com)
            Reporter: Michael Ho (kwho)