Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-12882

wait_for_metric_value() doesn't log the current value if allow_greater=True

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • Infrastructure
    • None
    • ghx-label-1

    Description

      I saw test failures due to the desired catalog update timeout to be applied on the impalad:

      -- 2024-02-22 07:40:09,898 INFO     MainThread: Started query 2b40be0cac7fd641:609759ce00000000
      -- 2024-02-22 07:41:10,139 INFO     MainThread: Waiting until events processor syncs to event id:40299
      -- 2024-02-22 07:41:10,618 DEBUG    MainThread: Metric last-synced-event-id has been increased to 40294 but has not yet             reached the desired value: 40299
      -- 2024-02-22 07:41:10,724 DEBUG    MainThread: Metric last-synced-event-id has reached the desired value: 40299
      -- 2024-02-22 07:41:11,008 DEBUG    MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
      -- 2024-02-22 07:41:11,046 INFO     MainThread: Getting metric: catalog.curr-version from ip-172-31-56-39:25000
      -- 2024-02-22 07:41:11,053 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-02-22 07:41:12,054 INFO     MainThread: Getting metric: catalog.curr-version from ip-172-31-56-39:25000
      -- 2024-02-22 07:41:12,061 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-02-22 07:41:13,062 INFO     MainThread: Getting metric: catalog.curr-version from ip-172-31-56-39:25000
      -- 2024-02-22 07:41:13,067 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-02-22 07:41:14,067 INFO     MainThread: Getting metric: catalog.curr-version from ip-172-31-56-39:25000
      -- 2024-02-22 07:41:14,075 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-02-22 07:41:15,077 INFO     MainThread: Getting metric: catalog.curr-version from ip-172-31-56-39:25000
      -- 2024-02-22 07:41:15,082 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-02-22 07:41:16,082 INFO     MainThread: Getting metric: catalog.curr-version from ip-172-31-56-39:25000
      -- 2024-02-22 07:41:16,086 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-02-22 07:41:17,087 INFO     MainThread: Getting metric: catalog.curr-version from ip-172-31-56-39:25000
      -- 2024-02-22 07:41:17,096 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-02-22 07:41:18,097 INFO     MainThread: Getting metric: catalog.curr-version from ip-172-31-56-39:25000
      -- 2024-02-22 07:41:18,113 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-02-22 07:41:19,114 INFO     MainThread: Getting metric: catalog.curr-version from ip-172-31-56-39:25000
      -- 2024-02-22 07:41:19,120 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-02-22 07:41:20,122 INFO     MainThread: Getting metric: catalog.curr-version from ip-172-31-56-39:25000
      -- 2024-02-22 07:41:20,129 INFO     MainThread: Sleeping 1s before next retry.
      -- 2024-02-22 07:41:21,130 INFO     MainThread: Metric catalog.curr-version did not reach value 8467 in 10s. Failing... 

      Unfortunately, the current value of "catalog.curr-version" is not logged so we don't know whether impalad is hanging or moving forward slowly.

      The logs come from wait_for_metric_value(). When allow_greater=True, the current values are not logged:

            # if allow_greater is True we wait until the metric value becomes >= the expected
            # value.
            if allow_greater:
              if value >= expected_value:
                LOG.info("Metric '%s' has reached desired value: %s" % (metric_name, value))
                return value
            elif value == expected_value:
              LOG.info("Metric '%s' has reached desired value: %s" % (metric_name, value))
              return value
            else:
              LOG.info("Waiting for metric value '%s'%s%s. Current value: %s" %
                  (metric_name, '>=' if allow_greater else '=', expected_value, value))
            LOG.info("Sleeping %ds before next retry." % interval)
            sleep(interval)

      https://github.com/apache/impala/blob/54642b2549c38625c868f09815b61170e568cdb9/tests/common/impala_service.py#L129-L138

      Attachments

        Activity

          People

            stigahuang Quanlong Huang
            stigahuang Quanlong Huang
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated: