  1. Hadoop Common
  2. HADOOP-16673

Add filter parameter to FileSystem>>listFiles


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 3.2.2
    • Fix Version/s: None
    • Component/s: fs, fs/s3
    • Labels: None

    Description

      Currently, recursively getting a filtered list of files in a directory is clumsy because the filtering has to happen afterwards, on the result list.

      Imagine we want to list all non-hidden files recursively.

      The non-hidden files filter is defined as:

      !name.startsWith("_") && !name.startsWith(".") 

       

      Then we can do:

       

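      RemoteIterator<LocatedFileStatus> remoteIterator = fs.listFiles(path, /* recursive */ true);
      while (remoteIterator.hasNext()) {
        LocatedFileStatus each = remoteIterator.next();
        // The filter must hold for every path element, so walk up to the root.
        boolean visible = true;
        for (Path p = each.getPath(); visible && p != null; p = p.getParent()) {
          String name = p.getName();
          visible = !name.startsWith("_") && !name.startsWith(".");
        }
        if (visible) {
          result.add(each);
        }
      }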
       
      

       

      For example each of these paths should be skipped:

      • /.a/b/c
      • /a/.b/c
      • /a/b/.c/
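
      The per-element check implied by these examples can be sketched in isolation. The snippet below is only an illustration: `HiddenPathFilter`, `isVisible` and `keepVisible` are hypothetical names, not part of the Hadoop API, and it works on plain slash-separated strings instead of `org.apache.hadoop.fs.Path` so that it runs without Hadoop on the classpath.

      ```java
      import java.util.ArrayList;
      import java.util.List;
      import java.util.function.Predicate;

      // Hypothetical helper for illustration only; not part of the Hadoop API.
      final class HiddenPathFilter {

        // The non-hidden name filter from the description.
        static final Predicate<String> NOT_HIDDEN =
            name -> !name.startsWith("_") && !name.startsWith(".");

        // Returns true only if every element of the slash-separated path passes the filter.
        static boolean isVisible(String path) {
          for (String element : path.split("/")) {
            // A leading or doubled slash yields an empty element; skip it.
            if (!element.isEmpty() && !NOT_HIDDEN.test(element)) {
              return false;
            }
          }
          return true;
        }

        // Post-filters an already-listed result, which is the workaround the description calls clumsy.
        static List<String> keepVisible(List<String> paths) {
          List<String> result = new ArrayList<>();
          for (String each : paths) {
            if (isVisible(each)) {
              result.add(each);
            }
          }
          return result;
        }
      }
      ```

      With this sketch, `isVisible` rejects all three paths listed above and accepts `/a/b/c`. Note that the whole tree is still listed before anything is discarded, so nothing under a hidden directory is pruned early.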

      It would be a lot better to have a filter parameter on listFiles. This is needed to solve HIVE-22411 effectively.


            People

              Assignee: Unassigned
              Reporter: Attila Magyar (amagyar)
              Votes: 0
              Watchers: 3
