Apache Drill / DRILL-7475

Support impersonation on local file system


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.15.0
    • Fix Version/s: Future
    • Component/s: None
    • Labels: None

    Description

      Hi,

      We'd like to set up Drill as a SQL interface for files stored on the local file system (non-HDFS) with multi-user access, where each user/group is authorized to access only selected tables/views.

       

      Environment:

      CentOS 6.7

      [dataiku@quickstart apache-drill-1.16.0]$ lsb_release -a
      LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch
      Distributor ID: CentOS
      Description: CentOS release 6.7 (Final)
      Release: 6.7
      Codename: Final

      Apache Drill 1.16.0 with the following config (drill-override.conf, Drill running as root user):

      drill.exec: {
        cluster-id: "unit8drill",
        zk.connect: "localhost:2181",
        impersonation: {
          enabled: true,
        },
        security: {
          auth.mechanisms : ["PLAIN"],
        },
        security.user.auth: {
          enabled: true,
          packages += "org.apache.drill.exec.rpc.user.security",
          impl: "pam4j",
          pam_profiles: [ "sudo", "login" ],
        }
      }
      

      DFS Storage definition from Drill:

      {
        "type": "file",
        "connection": "file:///",
        "config": null,
        "workspaces": {
          "tmp": {
            "location": "/tmp",
            "writable": true,
            "defaultInputFormat": null,
            "allowAccessOutsideWorkspace": false
          },
          "drill": {
            "location": "/home/dataiku/drill_datasets",
            "writable": false,
            "defaultInputFormat": null,
            "allowAccessOutsideWorkspace": false
          },
          "views": {
            "location": "/home/dataiku/views",
            "writable": true,
            "defaultInputFormat": null,
            "allowAccessOutsideWorkspace": false
          }
        },
        "formats": {
          "psv": {
            "type": "text",
            "extensions": [
              "tbl"
            ],
            "delimiter": "|"
          },
          "csv": {
            "type": "text",
            "extensions": [
              "csv"
            ],
            "delimiter": ","
          },
          "tsv": {
            "type": "text",
            "extensions": [
              "tsv"
            ],
            "delimiter": "\t"
          },
          "httpd": {
            "type": "httpd",
            "logFormat": "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""
          },
          "parquet": {
            "type": "parquet"
          },
          "json": {
            "type": "json",
            "extensions": [
              "json"
            ]
          },
          "pcap": {
            "type": "pcap"
          },
          "pcapng": {
            "type": "pcapng",
            "extensions": [
              "pcapng"
            ]
          },
          "avro": {
            "type": "avro"
          },
          "sequencefile": {
            "type": "sequencefile",
            "extensions": [
              "seq"
            ]
          },
          "csvh": {
            "type": "text",
            "extensions": [
              "csvh"
            ],
            "extractHeader": true,
            "delimiter": ","
          },
          "image": {
            "type": "image",
            "extensions": [
              "jpg",
              "jpeg",
              "jpe",
              "tif",
              "tiff",
              "dng",
              "psd",
              "png",
              "bmp",
              "gif",
              "ico",
              "pcx",
              "wav",
              "wave",
              "avi",
              "webp",
              "mov",
              "mp4",
              "m4a",
              "m4p",
              "m4b",
              "m4r",
              "m4v",
              "3gp",
              "3g2",
              "eps",
              "epsf",
              "epsi",
              "ai",
              "arw",
              "crw",
              "cr2",
              "nef",
              "orf",
              "raf",
              "rw2",
              "rwl",
              "srw",
              "x3f"
            ]
          }
        },
        "enabled": true
      }
      

      We created a view on the local file system (not HDFS) whose file permissions make it accessible only to user bob:

      [xyz@quickstart views]$ ls -l
      total 4
      -rwx------ 1 bob bob 247 Dec  5 11:57 project_1_abc.view.drill
      

       

      Steps to reproduce:

      Use sqlline to query the project_1_abc view as user alice:

      apache drill> select count(*) from dfs.views.project_1_abc;
      +--------+
      | EXPR$0 |
      +--------+
      | 418    |
      +--------+
      1 row selected (0.461 seconds)
      

      Expected result:

      Querying project_1_abc view as user alice should throw an error, as only bob user has access to this view.
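
The expectation above follows from the standard POSIX owner/group/other permission check. A minimal sketch of that check, assuming a simplified model (no ACLs, no directory traversal, uid 0 bypasses the check); the numeric ids are hypothetical and only the mode `-rwx------` comes from the listing:

```python
import stat

ROOT_UID = 0


def may_read(mode, file_uid, file_gid, user_uid, user_gids):
    """Return True if a simplified POSIX check would allow reading the file.

    Simplification: ignores ACLs and directory permissions, and treats
    uid 0 as always allowed.
    """
    if user_uid == ROOT_UID:
        return True  # root bypasses discretionary permission checks
    if user_uid == file_uid:
        return bool(mode & stat.S_IRUSR)  # owner bits apply
    if file_gid in user_gids:
        return bool(mode & stat.S_IRGRP)  # group bits apply
    return bool(mode & stat.S_IROTH)      # "other" bits apply


# Hypothetical numeric ids, for illustration only.
BOB_UID, BOB_GID = 1001, 1001
ALICE_UID, ALICE_GID = 1002, 1002
VIEW_MODE = 0o700  # -rwx------ as in the listing

print(may_read(VIEW_MODE, BOB_UID, BOB_GID, BOB_UID, {BOB_GID}))      # True
print(may_read(VIEW_MODE, BOB_UID, BOB_GID, ALICE_UID, {ALICE_GID}))  # False
print(may_read(VIEW_MODE, BOB_UID, BOB_GID, ROOT_UID, {0}))           # True
```

Under this model alice is denied only if the read is performed with her identity; a read performed as root succeeds regardless of the file mode.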

       

      Actual result:

      User alice is able to query the project_1_abc view even though she has no permissions on the file in the file system. The question is: does Drill support RBAC on the local file system? If so, what could we be doing wrong?

       

      Additional Information:

      The Drill process runs as root in order to have access to ```/etc/shadow``` etc.

       Authentication works fine. We're able to use sqlline as well as Web UI in order to run SQL queries. Also, users that are in the root group have access to Storage, Threads and Logs tabs.

       

      Unfortunately, all users have access to all tables/directories/views, regardless of the permissions set on the local file system. Furthermore, inspecting the Drill process with auditctl reveals that the Drill process user (root) is accessing the files, rather than the impersonated user, as one would expect with impersonation enabled.
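
One plausible reading of the audit result is that the operating system checks every local file open against the identity of the process doing the open, not against the logged-in Drill user. The sketch below is not Drill code; it only shows how to report which OS identity a process uses when it opens a file. The function names and the file name `example.view.drill` are hypothetical.

```python
import os
import stat
import tempfile


def describe_access(path):
    """Report the identity this process uses and whether it can read path."""
    st = os.stat(path)
    try:
        with open(path, "rb"):
            readable = True
    except PermissionError:
        readable = False
    return {
        "process_euid": os.geteuid(),       # identity the OS checks against
        "owner_uid": st.st_uid,             # file owner
        "mode": stat.filemode(st.st_mode),  # e.g. "-rwx------"
        "readable_by_process": readable,
    }


def demo():
    """Create a throwaway file with mode 0700 and describe access to it."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "example.view.drill")
        with open(path, "w") as handle:
            handle.write("{}")
        os.chmod(path, 0o700)
        return describe_access(path)


if __name__ == "__main__":
    print(demo())
```

Run as root against a file like the view above, such a check would report the file as readable even though the owner is bob, which matches what auditctl shows for the Drill process.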

       

      Attaching a Java debugger also reveals that, even though this is a local file system, Drill uses ```ProxyLocalFileSystem``` from the hive-exec JAR in ```ImpersonationUtil.createFileSystem(...)```.

       


            People

              Assignee: Vitalii Diravka (vitalii)
              Reporter: Krzysztof Styrc (kstyrc)
              Votes: 0
              Watchers: 3
