  1. Hadoop YARN
  2. YARN-9979

When an app with many containers expires, the scheduler event queue grows very large

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: resourcemanager, scheduler
    • Labels: None

    Description

      When an app with many containers expires, the scheduler event queue grows very large. In the ResourceManager log below it climbs from 9000 to more than 100000 entries in under 30 minutes:

      2019-11-11,21:39:49,690 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 9000
      2019-11-11,21:39:49,695 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 10000
      2019-11-11,21:39:49,700 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 11000
      2019-11-11,21:39:49,705 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 12000
      2019-11-11,21:39:49,710 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 13000
      2019-11-11,21:39:49,715 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 14000
      2019-11-11,21:39:49,720 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Discarded 1 messages due to full event buffer including: Size of scheduler event-queue is 15000
      2019-11-11,21:39:49,724 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 16000
      2019-11-11,21:39:49,729 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 17000
      2019-11-11,21:39:49,733 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 18000
      2019-11-11,21:40:14,953 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 19000
      2019-11-11,21:43:09,743 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 19000
      2019-11-11,21:43:09,750 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 20000
      2019-11-11,21:43:09,758 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 21000
      2019-11-11,21:43:09,766 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 22000
      2019-11-11,21:43:09,775 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 23000
      2019-11-11,21:43:09,783 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 24000
      2019-11-11,21:43:09,792 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 25000
      2019-11-11,21:43:09,800 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 26000
      2019-11-11,21:43:09,807 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 27000
      2019-11-11,21:43:09,814 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 28000
      2019-11-11,21:46:29,830 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 29000
      2019-11-11,21:46:29,841 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 30000
      2019-11-11,21:46:29,850 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 31000
      2019-11-11,21:46:29,862 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 32000
      2019-11-11,21:49:49,875 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 33000
      2019-11-11,21:49:49,875 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 34000
      2019-11-11,21:49:49,876 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 35000
      2019-11-11,21:49:49,882 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 36000
      2019-11-11,21:49:49,887 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 37000
      2019-11-11,21:49:49,891 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 38000
      2019-11-11,21:49:49,896 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 39000
      2019-11-11,21:49:49,900 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 40000
      2019-11-11,21:49:49,905 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 41000
      2019-11-11,21:49:49,910 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 42000
      2019-11-11,21:49:49,914 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 43000
      2019-11-11,21:49:49,919 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 44000
      2019-11-11,21:49:49,923 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 45000
      2019-11-11,21:49:49,927 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 46000
      2019-11-11,21:49:49,932 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 47000
      2019-11-11,21:49:49,938 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 48000
      2019-11-11,21:49:49,943 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 49000
      2019-11-11,21:49:49,947 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 50000
      2019-11-11,21:49:49,951 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 51000
      2019-11-11,21:49:49,956 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 52000
      2019-11-11,21:49:49,961 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 53000
      2019-11-11,21:49:49,967 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 54000
      2019-11-11,21:49:49,972 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 55000
      2019-11-11,21:49:49,976 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 56000
      2019-11-11,21:49:49,980 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 57000
      2019-11-11,21:49:49,983 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 58000
      2019-11-11,21:49:49,988 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 59000
      2019-11-11,21:49:49,991 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 60000
      2019-11-11,21:49:49,996 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 61000
      2019-11-11,21:53:10,004 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 61000
      2019-11-11,21:53:10,014 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 62000
      2019-11-11,21:53:10,022 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 63000
      2019-11-11,21:53:10,032 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 64000
      2019-11-11,21:53:10,034 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 65000
      2019-11-11,21:53:10,040 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 66000
      2019-11-11,21:53:10,046 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 67000
      2019-11-11,21:56:30,056 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 68000
      2019-11-11,21:56:30,067 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 69000
      2019-11-11,21:56:30,077 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 70000
      2019-11-11,21:56:30,086 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 71000
      2019-11-11,21:56:30,094 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 72000
      2019-11-11,21:56:30,102 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 73000
      2019-11-11,21:56:30,107 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 74000
      2019-11-11,21:56:30,111 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 75000
      2019-11-11,21:56:30,116 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 76000
      2019-11-11,21:56:30,122 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 77000
      2019-11-11,21:59:50,128 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 78000
      2019-11-11,21:59:50,135 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 79000
      2019-11-11,21:59:50,140 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 80000
      2019-11-11,21:59:50,145 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 81000
      2019-11-11,21:59:50,149 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 82000
      2019-11-11,21:59:50,154 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 83000
      2019-11-11,21:59:50,159 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 84000
      2019-11-11,21:59:50,164 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 85000
      2019-11-11,21:59:50,168 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 86000
      2019-11-11,21:59:52,305 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 87000
      2019-11-11,22:03:10,175 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 87000
      2019-11-11,22:03:10,181 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 88000
      2019-11-11,22:03:10,186 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 89000
      2019-11-11,22:03:10,191 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 90000
      2019-11-11,22:03:10,196 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 91000
      2019-11-11,22:03:10,201 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 92000
      2019-11-11,22:03:10,206 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Discarded 1 messages due to full event buffer including: Size of scheduler event-queue is 93000
      2019-11-11,22:03:10,211 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 94000
      2019-11-11,22:03:10,215 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Discarded 1 messages due to full event buffer including: Size of scheduler event-queue is 95000
      2019-11-11,22:06:30,221 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 96000
      2019-11-11,22:06:30,227 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 97000
      2019-11-11,22:06:30,234 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 98000
      2019-11-11,22:06:30,240 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 99000
      2019-11-11,22:06:30,245 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 100000
      2019-11-11,22:06:30,250 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 101000
      2019-11-11,22:07:40,962 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 102000
      2019-11-11,22:09:50,259 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 91000
      2019-11-11,22:09:50,269 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 92000
      2019-11-11,22:09:50,278 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 93000
      2019-11-11,22:09:50,287 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 94000
      2019-11-11,22:09:50,295 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 95000
      2019-11-11,22:09:50,302 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 96000
      2019-11-11,22:09:50,310 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 97000
      2019-11-11,22:13:03,082 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 53000
      2019-11-11,22:13:10,318 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 54000
      2019-11-11,22:13:10,324 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 55000
      2019-11-11,22:13:10,330 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 56000
      2019-11-11,22:13:10,338 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 57000
      2019-11-11,22:13:10,347 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 58000
      2019-11-11,22:13:10,354 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Size of scheduler event-queue is 59000
      

      Number of containers that expired in the corresponding minutes:

      [work@xxx zhoukang-yarn]$ grep "21:39:" expired.1 | wc -l
      11377
      [work@xxx zhoukang-yarn]$ grep "21:43:" expired.1 | wc -l
      10508
      [work@xxx zhoukang-yarn]$ grep "21:49:" expired.1 | wc -l
      29269
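
The per-minute counts above can also be produced in one pass instead of one grep per minute. The following is a minimal sketch, not part of the original report. It assumes every line of expired.1 starts with a timestamp in the same `yyyy-MM-dd,HH:mm:ss,SSS` form as the ResourceManager log above; the file's actual contents are not shown in this issue. The class name `ExpiredPerMinute` is hypothetical. Unlike the grep commands, it matches only the timestamp prefix, not the rest of the line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups log lines by the minute in their timestamp prefix.
final class ExpiredPerMinute {

  // Assumed prefix "yyyy-MM-dd,HH:mm:ss,SSS"; the first 16 characters
  // ("yyyy-MM-dd,HH:mm") identify the minute.
  private static final int MINUTE_KEY_LENGTH = 16;

  static Map<String, Integer> countByMinute(List<String> lines) {
    // TreeMap keeps the minutes in chronological (lexicographic) order.
    Map<String, Integer> counts = new TreeMap<>();
    for (String line : lines) {
      if (line.length() < MINUTE_KEY_LENGTH) {
        continue; // too short to carry a timestamp
      }
      counts.merge(line.substring(0, MINUTE_KEY_LENGTH), 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) throws IOException {
    // Usage: pass the path of the log file, e.g. expired.1
    List<String> lines = Files.readAllLines(Paths.get(args[0]));
    countByMinute(lines).forEach((minute, n) -> System.out.println(minute + " " + n));
  }
}
```

Comparing these counts with the queue-size log lines for the same minute shows whether each jump in queue size lines up with a burst of expirations.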
      
        private class PingChecker implements Runnable {
      
          @Override
          public void run() {
            while (!stopped && !Thread.currentThread().isInterrupted()) {
              synchronized (AbstractLivelinessMonitor.this) {
                Iterator<Map.Entry<O, Long>> iterator = running.entrySet().iterator();
      
                // avoid calculating current time everytime in loop
                long currentTime = clock.getTime();
      
                while (iterator.hasNext()) {
                  Map.Entry<O, Long> entry = iterator.next();
                  O key = entry.getKey();
                  long interval = getExpireInterval(key);
                  if (currentTime > entry.getValue() + interval) {
                    iterator.remove();
                    expire(key);
                    LOG.info("Expired:" + entry.getKey().toString()
                        + " Timed out after " + interval / 1000 + " secs");
                  }
                }
              }
              try {
                Thread.sleep(monitorInterval);
              } catch (InterruptedException e) {
                LOG.info(getName() + " thread interrupted");
                break;
              }
            }
          }
        }
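
The loop above calls `expire(key)` once for every timed-out entry within a single pass, so when thousands of containers time out together, the same number of expire calls happen back to back. The following is a minimal, self-contained sketch of that effect, not the YARN implementation. It assumes each `expire` call enqueues exactly one scheduler event. The class `ExpiryBurstSketch` and the `maxPerPass` cap are hypothetical; the cap is only one conceivable mitigation and is not taken from YARN-9979.001.patch, whose contents are not shown here.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of one liveliness-monitor pass feeding an event queue.
final class ExpiryBurstSketch {
  // Stand-in for the scheduler event queue: one entry per expired key.
  final Queue<String> eventQueue = new ArrayDeque<>();
  // key -> last ping time in ms, like the `running` map in the excerpt.
  final Map<String, Long> running = new LinkedHashMap<>();
  final long expireIntervalMs;

  ExpiryBurstSketch(long expireIntervalMs) {
    this.expireIntervalMs = expireIntervalMs;
  }

  void register(String key, long nowMs) {
    running.put(key, nowMs);
  }

  /**
   * Runs one monitor pass and returns how many keys expired.
   * maxPerPass is a hypothetical cap that is not in the excerpt;
   * Integer.MAX_VALUE reproduces the uncapped behaviour.
   */
  int checkOnce(long nowMs, int maxPerPass) {
    int expired = 0;
    Iterator<Map.Entry<String, Long>> it = running.entrySet().iterator();
    while (it.hasNext() && expired < maxPerPass) {
      Map.Entry<String, Long> entry = it.next();
      if (nowMs > entry.getValue() + expireIntervalMs) {
        String key = entry.getKey(); // read the key before removing the entry
        it.remove();
        eventQueue.add(key); // assumption: expire(key) enqueues one event
        expired++;
      }
    }
    return expired;
  }

  public static void main(String[] args) {
    int containers = 11377; // count reported for 21:39 in this issue

    ExpiryBurstSketch uncapped = new ExpiryBurstSketch(1000L);
    for (int i = 0; i < containers; i++) {
      uncapped.register("container_" + i, 0L);
    }
    uncapped.checkOnce(2000L, Integer.MAX_VALUE);
    System.out.println("uncapped pass queued " + uncapped.eventQueue.size() + " events");

    ExpiryBurstSketch capped = new ExpiryBurstSketch(1000L);
    for (int i = 0; i < containers; i++) {
      capped.register("container_" + i, 0L);
    }
    capped.checkOnce(2000L, 1000);
    System.out.println("capped pass queued " + capped.eventQueue.size()
        + " events, " + capped.running.size() + " keys left for later passes");
  }
}
```

A cap like this trades burst size for latency: keys left over are only expired on later passes, so their expiry is delayed by one or more monitor intervals.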
      

      Attachments

        1. YARN-9979.001.patch
          3 kB
          zhoukang

          People

            Assignee: cane zhoukang
            Reporter: cane zhoukang