  2. IMPALA-10579

Deadloop in table metadata loading when using an invalid RemoteIterator


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 3.4.0
    • Fix Version/s: Impala 4.0.0
    • Component/s: Catalog
    • Labels: None

    Description

      The file-listing thread in catalogd enters an infinite loop if it gets a RemoteIterator on a non-existent path. The first call to RemoteIterator.hasNext() throws a FileNotFoundException. However, the exception is caught and the loop continues, so hasNext() is called again and throws again, and the loop never terminates. Related code: https://github.com/apache/impala/blob/d89c04bf806682d3449c566ce979632bd2ac5b29/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L789-L814

        static class FilterIterator implements RemoteIterator<FileStatus> {
          ...
          public boolean hasNext() throws IOException {
            ...
            while (curFile_ == null) {
              FileStatus next;
              try {
                if (!baseIterator_.hasNext()) return false; // <---- throws FileNotFoundException
                ...
                next = baseIterator_.next();
              } catch (FileNotFoundException ex) {
                ...
                LOG.warn(ex.getMessage());
                continue;  // <--------- swallows the exception and retries hasNext(), which throws again: infinite loop
              }
              if (!isInIgnoredDirectory(startPath_, next)) {
                curFile_ = next;
                return true;
              }
            }
            return true;
          }
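
The failure mode above can be reproduced without Hadoop. The following is a minimal sketch, not Impala's actual fix: `SimpleRemoteIterator`, `MissingPathIterator`, `FlakyIterator`, and `countEntries` are hypothetical names introduced only for illustration. It assumes one possible remedy, which is to tolerate a FileNotFoundException only from next() (the iterator has advanced, so retrying is safe) and to let one thrown by hasNext() propagate to the caller.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

public class DeadLoopSketch {
  /** Minimal stand-in for Hadoop's RemoteIterator (hypothetical, for illustration only). */
  interface SimpleRemoteIterator<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Models an iterator on a missing path: every hasNext() call fails the same way. */
  static class MissingPathIterator implements SimpleRemoteIterator<String> {
    int hasNextCalls = 0;
    public boolean hasNext() throws IOException {
      hasNextCalls++;
      throw new FileNotFoundException("path does not exist");
    }
    public String next() throws IOException {
      throw new FileNotFoundException("path does not exist");
    }
  }

  /** Models a listing in which some entries vanish between hasNext() and next(). */
  static class FlakyIterator implements SimpleRemoteIterator<String> {
    private final Iterator<String> it;
    FlakyIterator(List<String> names) { it = names.iterator(); }
    public boolean hasNext() { return it.hasNext(); }
    public String next() throws IOException {
      String name = it.next();  // the underlying iterator advances before the failure
      if (name.startsWith("gone")) throw new FileNotFoundException(name);
      return name;
    }
  }

  /** Counts entries, skipping ones that vanish in next() but propagating hasNext() failures. */
  static int countEntries(SimpleRemoteIterator<String> base) throws IOException {
    int count = 0;
    while (true) {
      // Deliberately outside the try block: if hasNext() throws, retrying
      // would throw again forever, so the exception goes to the caller.
      if (!base.hasNext()) return count;
      try {
        base.next();
        count++;
      } catch (FileNotFoundException ex) {
        // Safe to continue: next() already moved past the missing entry.
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // prints 2
    System.out.println(countEntries(new FlakyIterator(List.of("a", "gone-b", "c"))));
    MissingPathIterator missing = new MissingPathIterator();
    try {
      countEntries(missing);
    } catch (FileNotFoundException ex) {
      // prints "listing failed after 1 hasNext() call(s)"
      System.out.println("listing failed after " + missing.hasNextCalls + " hasNext() call(s)");
    }
  }
}
```

Moving the hasNext() call inside the try block and using `continue` in the catch, as in the snippet quoted above, would make `countEntries(missing)` spin forever, because nothing changes between iterations.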
      

      When does the path being loaded not exist?
      It happens when the metadata in HMS (the table or partition location) still references the path, but the path has actually been removed from storage.

      When does Impala get such an invalid RemoteIterator?
      With FileSystem implementations that don't override FileSystem#listStatusIterator(), e.g. S3AFileSystem before HADOOP-17281, AzureBlobFileSystem, and GoogleHadoopFileSystem.

            People

              Assignee: Quanlong Huang (stigahuang)
              Reporter: Quanlong Huang (stigahuang)