HIVE-17802

Remove unnecessary calls to FileSystem.setOwner() from FileOutputCommitterContainer

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0, 3.0.0
    • Fix Version/s: None
    • Component/s: HCatalog
    • Labels: None
    • YHIVE-751

    Description

      For large Pig/HCat queries that produce a large number of partitions/directories/files, we have seen cases where the HDFS NameNode groaned under the weight of FileSystem.setOwner() calls originating from the commit step. These calls come from the following code in FileOutputCommitterContainer, which issues one setOwner() call for every file it visits:

        private void applyGroupAndPerms(FileSystem fs, Path dir, FsPermission permission,
                          List<AclEntry> acls, String group, boolean recursive)
            throws IOException {
          ...
          if (recursive) {
            for (FileStatus fileStatus : fs.listStatus(dir)) {
              if (fileStatus.isDir()) {
                applyGroupAndPerms(fs, fileStatus.getPath(), permission, acls, group, true);
              } else {
                fs.setPermission(fileStatus.getPath(), permission);
                chown(fs, fileStatus.getPath(), group);
              }
            }
          }
        }
      
        private void chown(FileSystem fs, Path file, String group) throws IOException {
          try {
            fs.setOwner(file, null, group);
          } catch (AccessControlException ignore) {
            // Some users have wrong table group, ignore it.
            LOG.warn("Failed to change group of partition directories/files: " + file, ignore);
          }
        }
      

      One setOwner() call per file/directory is far too many. We have a patch that reduces the NameNode pressure.
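
      The description does not say how the patch cuts down the calls, so the following is only a sketch of one plausible approach, not the patch itself: skip the group change when the listing already reports the target group. In Hadoop, the FileStatus objects returned by fs.listStatus() expose getGroup(), so the check needs no extra round trip. To stay runnable without Hadoop on the classpath, the sketch uses hypothetical stand-ins (ChownSketch, Entry, FakeFs, applyGroup) in place of FileStatus and FileSystem.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: Entry and FakeFs are hypothetical stand-ins,
// not Hadoop's FileStatus/FileSystem classes.
class ChownSketch {

    // A listed file or directory, carrying the group the listing reported.
    static final class Entry {
        final String path;
        String group;

        Entry(String path, String group) {
            this.path = path;
            this.group = group;
        }
    }

    // Counts simulated setOwner() round trips to the NameNode.
    static final class FakeFs {
        int setOwnerCalls = 0;

        void setGroup(Entry entry, String group) {
            setOwnerCalls++;
            entry.group = group;
        }
    }

    // Changes the group only for entries whose listed group differs
    // from the target, instead of calling setGroup() unconditionally.
    static void applyGroup(FakeFs fs, List<Entry> listing, String group) {
        for (Entry entry : listing) {
            if (!group.equals(entry.group)) {
                fs.setGroup(entry, group);
            }
        }
    }

    public static void main(String[] args) {
        List<Entry> listing = new ArrayList<>();
        listing.add(new Entry("/t/p=1/part-0", "hive"));
        listing.add(new Entry("/t/p=1/part-1", "hive"));
        listing.add(new Entry("/t/p=2/part-0", "users"));

        FakeFs fs = new FakeFs();
        applyGroup(fs, listing, "hive");
        System.out.println("setOwner calls: " + fs.setOwnerCalls);
    }
}
```

      In this model only the entry whose group is "users" triggers a call; the two entries already in group "hive" are skipped. Since new HDFS files take the group of their parent directory by default, most files written under a correctly grouped table directory would likely fall into the skipped case.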

      Attachments

        1. HIVE-17802.1.patch
          30 kB
          Mithun Radhakrishnan
        2. HIVE-17802.2.patch
          36 kB
          Mithun Radhakrishnan
        3. HIVE-17802.2-branch-2.patch
          30 kB
          Mithun Radhakrishnan

            People

              cdrome Chris Drome
              mithun Mithun Radhakrishnan