  1. Apache Drill
  2. DRILL-7392

Exclude some files when requesting directory

Details

    • Type: Wish
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • None
    • 1.16.0
    • None
    • None

    Description

      Currently Drill ignores files starting with dot ('.') or underscore ('_').

      When querying a directory that contains files of different types or schemas at multiple levels of the tree, it would be useful and more flexible to also have option(s) to exclude some files by extension, or perhaps with a regexp.

      For example:

      myTable
      |--D1
         |--file1.csv
         |--file2.csv
      |--D2
         |--SubD2
            |--file1.csv
         |--file1.csv
         |--file1.xml
         |--file1.json
      

      Without entering into a debate about how the data should best be organised, the current way to query all the CSV files in this example is:

      SELECT * FROM ....`myTable/*/*.csv`
      UNION
      SELECT * FROM ....`myTable/*/*/*.csv`
      

      It would be useful to be able to query myTable directly, for example:

      /* ALTER SESSION SET exclude_files='xml,json' */
      /* or */
      /* ALTER SESSION SET only_files='csv' */
      SELECT * FROM myTable
      

          People

            Assignee: Unassigned
            Reporter: benj