HIVE-11734 (project: Hive)

Hive Server2 not impersonating HDFS for CREATE TABLE/DATABASE with KERBEROS auth

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.1.1
    • Fix Version/s: None
    • Component/s: Authorization
    • Labels: None

    Description

      My configuration is as follows:

      hive-site.xml:
      hive.server2.enable.doAs=true
      hive.metastore.execute.setugi=true
      hive.security.metastore.authorization.auth.reads=true
      hive.metastore.sasl.enabled=true
      hive.server2.authentication=KERBEROS
      hive.server2.thrift.sasl.qop=auth-conf
      hive.warehouse.subdir.inherit.perms=false
      ...
      
      hdfs-site.xml:
      dfs.block.access.token.enable=true
      fs.permissions.umask-mode=027
      ...
      
      core-site.xml:
      hadoop.security.authentication=kerberos
      hadoop.security.authorization=true
      hadoop.proxyuser.hive.hosts=localhost,master
      hadoop.proxyuser.hive.groups=*
      ...
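
The settings above are listed as key=value pairs, while the actual Hadoop and Hive files store them as XML <property> entries. A minimal sketch of that mapping, using only the two proxyuser settings quoted above (the helper name render_properties is made up for illustration and is not part of any Hadoop tooling):

```python
from xml.sax.saxutils import escape


def render_properties(settings):
    """Render key=value settings as a Hadoop-style <configuration> document."""
    lines = ["<configuration>"]
    for name, value in settings.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


# Only the proxyuser settings from the core-site.xml listing in this report.
core_site = {
    "hadoop.proxyuser.hive.hosts": "localhost,master",
    "hadoop.proxyuser.hive.groups": "*",
}

core_site_xml = render_properties(core_site)
print(core_site_xml)
```

These two properties are what permit the 'hive' service user to impersonate other users from the listed hosts, which is why they matter for the doAs behaviour described below.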
      

      When I create a database or a table with beeline as a Kerberos-authenticated (kinit'ed) user, the HDFS directories that Hive creates are owned by the 'hive' user, and their group is inherited from the parent directory ('data' in my case). The 'hive' user does not belong to that group at all, but it is in the supergroup.
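
One way to confirm the ownership symptom is to scan the output of `hdfs dfs -ls` on the warehouse directory and flag entries not owned by the expected user. The sketch below only parses listing text; the sample lines, the paths and the user name 'alice' are invented for illustration and are not taken from the reporter's cluster.

```python
from typing import NamedTuple


class LsEntry(NamedTuple):
    mode: str
    owner: str
    group: str
    path: str


def parse_ls_line(line):
    """Parse one entry line of `hdfs dfs -ls` output.

    Fields: permissions, replication, owner, group, size, date, time, path.
    """
    fields = line.split(None, 7)
    if len(fields) != 8:
        raise ValueError(f"unexpected listing line: {line!r}")
    mode, _replication, owner, group, _size, _date, _time, path = fields
    return LsEntry(mode, owner, group, path)


def wrongly_owned(lines, expected_owner):
    """Return entries whose owner differs from expected_owner."""
    # Skip the "Found N items" header and blank lines; entry lines
    # start with 'd' (directory) or '-' (file).
    entries = [parse_ls_line(l) for l in lines if l[:1] in ("d", "-")]
    return [e for e in entries if e.owner != expected_owner]


# Hypothetical listing: 'alice' stands in for the kinit'ed user.
sample_listing = [
    "Found 2 items",
    "drwxr-x---   - hive  data          0 2015-09-03 10:15 /data/warehouse/mydb.db",
    "drwxr-x---   - alice data          0 2015-09-03 10:20 /data/warehouse/otherdb.db",
]

mismatched = wrongly_owned(sample_listing, expected_owner="alice")
for entry in mismatched:
    print(f"{entry.path}: owned by {entry.owner}:{entry.group}")
```

With working impersonation, a directory created by a DDL statement would be expected to show the kinit'ed user as owner, so the list of mismatches would be empty.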

      Now when I try to load the data (or run any other map-reduce job), the table files end up owned by the kinit'ed user, and the user actually running the YARN container is also the kinit'ed user (not 'hive').

      This causes permission errors when I run queries that use map-reduce, since I don't own the database and table directories.
      It also allows anybody to drop my database/table, since that operation is performed as the 'hive' user, which is in the supergroup.
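
The two effects above follow from the basic HDFS permission model: owner bits apply to the owner, group bits to members of the directory's group, other bits to everyone else, and superusers bypass the check. The sketch below is a simplified model (no ACLs, no sticky bit). It assumes the table directory is created with mode 777 masked by the umask 027 from hdfs-site.xml, and it uses 'alice' as a hypothetical kinit'ed user who is a member of the 'data' group.

```python
def hdfs_allows(user, user_groups, owner, group, mode, want,
                supergroup="supergroup"):
    """Simplified HDFS permission check.

    mode is an octal int such as 0o750; want is 'r', 'w' or 'x'.
    """
    if supergroup in user_groups:
        return True  # members of the supergroup bypass permission checks
    bit = {"r": 4, "w": 2, "x": 1}[want]
    if user == owner:
        bits = (mode >> 6) & 7   # owner class
    elif group in user_groups:
        bits = (mode >> 3) & 7   # group class
    else:
        bits = mode & 7          # other class
    return bool(bits & bit)


# Assumption: directories start at 777 and the umask 027 is applied.
dir_mode = 0o777 & ~0o027

# Directory created by HiveServer2 without impersonation: hive:data.
alice_can_write = hdfs_allows("alice", {"alice", "data"},
                              "hive", "data", dir_mode, "w")
hive_can_write = hdfs_allows("hive", {"hive", "supergroup"},
                             "hive", "data", dir_mode, "w")

# Same directory if it had been created as the kinit'ed user.
alice_can_write_if_owner = hdfs_allows("alice", {"alice", "data"},
                                       "alice", "alice", dir_mode, "w")

print(oct(dir_mode), alice_can_write, hive_can_write,
      alice_can_write_if_owner)
```

Under these assumptions the kinit'ed user cannot write into a directory owned by 'hive', while anything executed as 'hive' succeeds regardless of the mode bits.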

      What I want is for DDL queries to access HDFS as the kinit'ed user, so that database/table directories end up owned by that user.

      Is this a bug or a configuration problem?

      Also, the group should be the user's primary group (inherit.perms=false), not the group of the parent directory. That way I can use owner/group authorisation on HDFS to grant or restrict access by group.

      As it stands, this is a serious security issue, and it also renders the whole doAs/impersonation system useless for me.

      Also see my question on Serverfault:
      http://serverfault.com/questions/717483/hive-server2-not-impersonating-hdfs

      Versions:

      hadoop-0.20-mapreduce-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
      hadoop-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
      hadoop-client-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
      hadoop-hdfs-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
      hadoop-hdfs-namenode-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
      hadoop-mapreduce-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
      hadoop-mapreduce-historyserver-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
      hadoop-yarn-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
      hadoop-yarn-resourcemanager-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
      hive-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
      hive-jdbc-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
      hive-metastore-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
      hive-server2-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
      

          People

            Assignee: Unassigned
            Reporter: Jakub Pastuszek (jpastuszek)
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated: