Hive / HIVE-13819

Read & eXecute permissions on a Database allow ALTERing it.


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.1
    • Fix Version/s: None
    • Component/s: Authorization
    • Labels: None
    • Environment: Hadoop 2.7.2, Hive 1.2.1, Kerberos

    Description

      Hi,

As the owner of a Hive database, I can modify the database's metadata even though I only have read and execute permissions on the database directory in the warehouse.
I expected not to be able to modify this metadata.

      Context:

      • Hive database configured with the Storage Based Authorization strategy.
      • Hive client authorization is disabled.
      • Metastore side security is activated.
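Under storage-based authorization, the metastore is expected to authorize each metadata operation by checking filesystem permissions on the directory that backs the object; an ALTER DATABASE is a write-class operation and should therefore require the write bit on the database directory. The sketch below models that expected mapping (the dictionary and helper names are illustrative, not Hive's actual classes or API):

```python
# Illustrative model of storage-based authorization: a metastore
# operation is authorized against POSIX-style permissions on the
# backing warehouse directory. Names here are hypothetical.

# Privilege a metastore event should demand on the backing directory.
REQUIRED_PRIVILEGE = {
    "CREATE_TABLE": "write",      # creates a subdirectory
    "DROP_TABLE": "write",        # removes a subdirectory
    "ALTER_DATABASE": "write",    # mutates the object's metadata
    "DESCRIBE_DATABASE": "read",  # read-only access
}

def is_authorized(event: str, owner_perms: str) -> bool:
    """owner_perms is the owner triad of an HDFS mode string, e.g. 'r-x'."""
    needed = REQUIRED_PRIVILEGE[event]
    if needed == "read":
        return "r" in owner_perms
    return "w" in owner_perms  # write-class operations need the w bit

# With owner permissions r-x, as in this report:
print(is_authorized("ALTER_DATABASE", "r-x"))     # False: should be denied
print(is_authorized("DESCRIBE_DATABASE", "r-x"))  # True: reads are fine
```

Under this model, the ALTER DATABASE shown below should have been rejected; the bug is that it succeeds.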

      Permission configuration:

      dr-x--x---   - hive9990    hive9990             0 2016-05-20 17:10 /path/to/hive/warehouse/p09990.db
      

      ALTER command as hive9990 user:

      hive (p09990)>  ALTER DATABASE p09990 SET DBPROPERTIES ('comment'='database altered');
      OK
      Time taken: 0.277 seconds
      hive (p09990)> DESCRIBE DATABASE EXTENDED p09990;
      OK
      p09990          hdfs://path/to/hive/warehouse/p09990.db        hdfs    USER    {comment=database altered}
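Decoding the mode string from the listing above makes the inconsistency explicit: `dr-x--x---` gives the owner only `r-x`, so a write-class operation such as ALTER DATABASE should fail, yet it returned OK. A small sketch parsing that listing (nothing here is Hive code):

```python
def parse_mode(mode: str):
    """Split an ls-style mode string like 'dr-x--x---' into
    (owner, group, other) permission triads."""
    bits = mode[1:]  # drop the file-type character ('d' for directory)
    return bits[0:3], bits[3:6], bits[6:9]

owner, group, other = parse_mode("dr-x--x---")
print(owner)         # 'r-x'
print("w" in owner)  # False: the owner has no write bit, so the
                     # ALTER DATABASE above should have been denied
```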
      

      Configuration of hive-site.xml on the metastore:

      <?xml version="1.0"?>
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
      
      <configuration>
       
        <property>
            <name>hive.security.authorization.enabled</name>
            <value>false</value>
            <description>enable or disable the Hive client authorization</description>
        </property>
      
        <property>
            <name>hive.security.metastore.authorization.manager</name>
            <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
            <description>authorization manager class name to be used in the metastore for authorization.
            The user defined authorization class should implement interface org.apache.hadoop.hive.ql.security.authorization.HiveMetastoreAuthorizationProvider.
            </description>
        </property>
      
        <property>
            <name>hive.metastore.pre.event.listeners</name>
            <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
            <description>This turns on metastore-side security.
            </description>
        </property>
      
        <property>
            <name>hive.security.metastore.authorization.auth.reads</name>
            <value>true</value>
            <description>If this is true, the metastore authorizer authorizes read actions on database and table.
            </description>
        </property>
      
        <property>
            <name>hive.security.authorization.manager</name>
            <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
            <description>The Hive client authorization manager class name.
        The user defined authorization class should implement interface org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProvider.
            </description>
        </property>
      
        <property>
            <name>hive.security.authorization.createtable.owner.grants</name>
            <value>ALL</value>
            <description>the privileges automatically granted to the owner whenever a table gets created. 
             An example like "select,drop" will grant select and drop privilege to the owner of the table</description>
        </property>
      
        <property>
            <name>hive.users.in.admin.role</name>
            <value>hdfs</value>
            <description>Comma separated list of users who are in admin role for bootstrapping.
          More users can be added in ADMIN role later.</description>
        </property>
      
        <property>
            <name>hive.metastore.warehouse.dir</name>
            <value>/path/to/hive/warehouse/</value>
            <description>location of default database for the warehouse</description>
        </property>
      
        <property>
            <name>hive.cli.print.current.db</name>
            <value>true</value>
            <description>Whether to include the current database in the Hive prompt.</description>
        </property>
      
        <property>
            <name>hive.metastore.uris</name>
            <value>thrift://hiveserver2http01:9083</value>
            <description>Thrift uri for the remote metastore. Used by metastore client to connect to remote metastore.</description>
        </property>
      
        <property>
            <name>javax.jdo.option.ConnectionDriverName</name>
            <value>com.mysql.jdbc.Driver</value>
            <description>JDBC Driver</description>
        </property>
      
        <property>
            <name>javax.jdo.option.ConnectionURL</name>
            <value>jdbc:mysql://hivedb01/metastore</value>
            <description>JDBC connect string for a JDBC metastore</description>
        </property>
      
        <property>
            <name>javax.jdo.option.ConnectionUserName</name>
            <value>metastore</value>
            <description>username to use against metastore database</description>
        </property>
      
        <property>
            <name>javax.jdo.option.ConnectionPassword</name>
            <value>********</value>
            <description>password to use against metastore database</description>
        </property>
      
        <property>
            <name>datanucleus.autoCreateSchema</name>
            <value>false</value>
            <description>creates necessary schema on a startup if one doesn't exist. set this to false, after creating it once</description>
        </property>
      
        <property>
            <name>hive.metastore.authorization.storage.checks</name>
            <value>true</value>
            <description>Should the metastore do authorization checks against the underlying storage
        for operations like drop-partition (disallow the drop-partition if the user in
        question doesn't have permissions to delete the corresponding directory
        on the storage).</description>
        </property>
      
        <property>
            <name>hive.metastore.sasl.enabled</name>
            <value>true</value>
            <description>If true, the metastore thrift interface will be secured with SASL. Clients must authenticate with Kerberos.</description>
        </property>
      
        <property>
            <name>hive.metastore.kerberos.keytab.file</name>
            <value>/path/to/metastore.keytab</value>
            <description>The path to the Kerberos Keytab file containing the metastore thrift server's service principal.</description>
        </property>
      
        <property>
            <name>hive.metastore.kerberos.principal</name>
            <value>primary/instance@realm</value>
            <description>The service principal for the metastore thrift server. The special string _HOST will be replaced automatically with the correct host name.</description>
        </property>
      
        <property>
            <name>hive.server2.max.start.attempts</name>
            <value>30</value>
            <description>This number of times HiveServer2 will attempt to start before exiting, sleeping 60 seconds between retries. The default of 30 will keep trying for 30 minutes.</description>
        </property>
      
        <property>
            <name>hive.server2.transport.mode</name>
            <value>binary</value>
            <description>Server transport mode. "binary" or "http".</description>
        </property>
      
        <property>
            <name>hive.server2.thrift.http.port</name>
            <value>10001</value>
            <description>Port number when in HTTP mode.</description>
        </property>
      
        <property>
            <name>hive.server2.thrift.http.path</name>
            <value>bdcorp</value>
            <description>Path component of URL endpoint when in HTTP mode.</description>
        </property>
      
        <property>
            <name>hive.server2.use.SSL</name>
            <value>false</value>
            <description>Set this to true for using SSL encryption in HiveServer2</description>
        </property>
      
        <property>
            <name>hive.server2.keystore.path</name>
            <value></value>
            <description>SSL certificate keystore location</description>
        </property>
      
        <property>
            <name>hive.server2.keystore.password</name>
            <value></value>
            <description>SSL certificate keystore password.</description>
        </property>
      
        <property>
            <name>hive.server2.authentication.pam.services</name>
            <value></value>
            <description>List of the underlying pam services that should be used when auth type is PAM.
        A file with the same name must exist in /etc/pam.d</description>
        </property>
      
        <property>
            <name>hive.server2.thrift.min.worker.threads</name>
            <value>5</value>
            <description>Minimum number of Thrift worker threads</description>
        </property>
      
        <property>
            <name>hive.server2.thrift.max.worker.threads</name>
            <value>500</value>
            <description>Maximum number of Thrift worker threads</description>
        </property>
      
        <property>
            <name>hive.server2.thrift.worker.keepalive.time</name>
            <value>60</value>
            <description>Keepalive time (in seconds) for an idle worker thread. 
          When number of workers > min workers, excess threads are killed after this time interval.
            </description>
        </property>
      
        <property>
            <name>hive.server2.thrift.http.cookie.auth.enabled</name>
            <value>true</value>
            <description>When true, HiveServer2 in HTTP transport mode will use cookie based authentication mechanism.
            </description>
        </property>
      
        <property>
            <name>hive.server2.thrift.http.cookie.max.age</name>
            <value>86400s</value>
            <description>Maximum age in seconds for server side cookie used by HiveServer2 in HTTP mode.
            </description>
        </property>
      
        <property>
            <name>hive.server2.thrift.http.cookie.path</name>
            <value></value>
            <description>Path for the HiveServer2 generated cookies.
            </description>
        </property>
      
        <property>
            <name>hive.server2.thrift.http.cookie.domain</name>
            <value></value>
            <description>Domain for the HiveServer2 generated cookies.
            </description>
        </property>
      
        <property>
            <name>hive.server2.thrift.http.cookie.is.secure</name>
            <value>true</value>
            <description>Secure attribute of the HiveServer2 generated cookie.
            </description>
        </property>
      
        <property>
            <name>hive.server2.thrift.http.cookie.is.httponly</name>
            <value>true</value>
            <description>HttpOnly attribute of the HiveServer2 generated cookie.
            </description>
        </property>
      
        <property>
            <name>hive.server2.async.exec.threads</name>
            <value>100</value>
            <description>Number of threads in the async thread pool for HiveServer2</description>
        </property>
      
        <property>
            <name>hive.server2.async.exec.shutdown.timeout</name>
            <value>10</value>
            <description>Time (in seconds) for which HiveServer2 shutdown will wait for async
        threads to terminate</description>
        </property>
      
        <property>
            <name>hive.server2.async.exec.keepalive.time</name>
            <value>10</value>
            <description>Time (in seconds) that an idle HiveServer2 async thread (from the thread pool) will wait
        for a new task to arrive before terminating</description>
        </property>
      
        <property>
            <name>hive.server2.long.polling.timeout</name>
            <value>5000</value>
            <description>Time in milliseconds that HiveServer2 will wait, before responding to asynchronous calls that use long polling</description>
        </property>
      
        <property>
            <name>hive.server2.async.exec.wait.queue.size</name>
            <value>100</value>
            <description>Size of the wait queue for async thread pool in HiveServer2.
        After hitting this limit, the async thread pool will reject new requests.</description>
        </property>
      
        <property>
            <name>hive.server2.thrift.port</name>
            <value>10000</value>
            <description>Port number of HiveServer2 Thrift interface.
        Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>
        </property>
      
        <property>
            <name>hive.server2.thrift.bind.host</name>
            <value>hiveserver2http01</value>
            <description>Bind host on which to run the HiveServer2 Thrift interface.
        Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>
        </property>
      
        <property>
            <name>hive.server2.authentication</name>
            <value>KERBEROS</value>
            <description>
          Client authentication types.
             NONE: no authentication check
             LDAP: LDAP/AD based authentication
             KERBEROS: Kerberos/GSSAPI authentication
             CUSTOM: Custom authentication provider
                     (Use with property hive.server2.custom.authentication.class)
             PAM: Pluggable authentication module.
            </description>
        </property>
      
        <property>
            <name>hive.server2.custom.authentication.class</name>
            <value></value>
            <description>
          Custom authentication class. Used when property
          'hive.server2.authentication' is set to 'CUSTOM'. Provided class
          must be a proper implementation of the interface
          org.apache.hive.service.auth.PasswdAuthenticationProvider. HiveServer2
          will call its Authenticate(user, passed) method to authenticate requests.
          The implementation may optionally extend Hadoop's
          org.apache.hadoop.conf.Configured class to grab Hive's Configuration object.
            </description>
        </property>
      
        <property>
            <name>hive.server2.authentication.kerberos.principal</name>
            <value>primary/instance@realm</value>
            <description>
          Kerberos server principal
            </description>
        </property>
      
        <property>
            <name>hive.server2.authentication.kerberos.keytab</name>
            <value>/path/to/hiveserver2.keytab</value>
            <description>
          Kerberos keytab file for server principal
            </description>
        </property>
      
        <property>
            <name>hive.server2.authentication.spnego.principal</name>
            <value>primary/instance@realm</value>
            <description>
          SPNego service principal, optional,
          typical value would look like HTTP/_HOST@EXAMPLE.COM
          SPNego service principal would be used by hiveserver2 when kerberos security is enabled
          and HTTP transport mode is used.
          This needs to be set only if SPNEGO is to be used in authentication.
            </description>
        </property>
      
        <property>
            <name>hive.server2.authentication.spnego.keytab</name>
            <value>/path/to/spnego.keytab</value>
            <description>
          keytab file for SPNego principal, optional,
          typical value would look like /etc/security/keytabs/spnego.service.keytab,
          This keytab would be used by hiveserver2 when kerberos security is enabled
          and HTTP transport mode is used.
          This needs to be set only if SPNEGO is to be used in authentication.
          SPNego authentication would be honored only if valid
          hive.server2.authentication.spnego.principal
          and
          hive.server2.authentication.spnego.keytab
          are specified
            </description>
        </property>
      
        <property>
            <name>hive.server2.authentication.ldap.url</name>
            <value>setindatabag</value>
            <description>
          LDAP connection URL
            </description>
        </property>
      
        <property>
            <name>hive.server2.authentication.ldap.baseDN</name>
            <value>setindatabag</value>
            <description>
          LDAP base DN
            </description>
        </property>
      
        <property>
            <name>hive.server2.enable.doAs</name>
            <value>true</value>
            <description>
         Setting this property to true will have HiveServer2 execute
          Hive operations as the user making the calls to it.
            </description>
        </property>
      
        <property>
            <name>hive.execution.engine</name>
            <value>mr</value>
            <description>
          Chooses execution engine. Options are: mr (Map reduce, default) or tez (hadoop 2 only)
            </description>
        </property>
      
        <property>
            <name>hive.mapjoin.optimized.hashtable</name>
            <value>true</value>
            <description>Whether Hive should use a memory-optimized hash table for MapJoin. 
          Only works on Tez, because memory-optimized hash table cannot be serialized.
            </description>
        </property>
      
        <property>
            <name>hive.mapjoin.optimized.hashtable.wbsize</name>
            <value>10485760</value>
            <description>Optimized hashtable (see hive.mapjoin.optimized.hashtable) uses a chain of buffers to store data. 
          This is one buffer size. Hashtable may be slightly faster if this is larger, 
          but for small joins unnecessary memory will be allocated and then trimmed.
            </description>
        </property>
      
        <property>
            <name>hive.prewarm.enabled</name>
            <value>false</value>
            <description>
          Enables container prewarm for tez (hadoop 2 only)
            </description>
        </property>
      
        <property>
            <name>hive.prewarm.numcontainers</name>
            <value>10</value>
            <description>
          Controls the number of containers to prewarm for tez (hadoop 2 only)
            </description>
        </property>
      
        <property>
            <name>hive.server2.table.type.mapping</name>
            <value>CLASSIC</value>
            <description>
         This setting reflects how HiveServer2 will report the table types for JDBC and other
         client implementations that retrieve the available tables and supported table types
           HIVE : Exposes Hive's native table types like MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW
           CLASSIC : More generic types like TABLE and VIEW
            </description>
        </property>
      
        <property>
            <name>hive.server2.thrift.sasl.qop</name>
            <value>auth</value>
            <description>Sasl QOP value; Set it to one of following values to enable higher levels of
           protection for HiveServer2 communication with clients.
            "auth" - authentication only (default)
            "auth-int" - authentication plus integrity protection
            "auth-conf" - authentication plus integrity and confidentiality protection
           This is applicable only if HiveServer2 is configured to use Kerberos authentication.
            </description>
        </property>
      
        <property>
            <name>hive.tez.container.size</name>
            <value>-1</value>
            <description>By default tez will spawn containers of the size of a mapper. This can be used to overwrite.</description>
        </property>
      
        <property>
            <name>hive.tez.java.opts</name>
            <value></value>
            <description>By default tez will use the java opts from map tasks. This can be used to overwrite.</description>
        </property>
      
        <property>
            <name>hive.tez.log.level</name>
            <value>INFO</value>
            <description>
          The log level to use for tasks executing as part of the DAG.
          Used only if hive.tez.java.opts is used to configure java opts.
            </description>
        </property>
      
        <property>
            <name>hive.tez.smb.number.waves</name>
            <value>1</value>
            <description>The number of waves in which to run the SMB (sort-merge-bucket) join. 
          Account for cluster being occupied. Ideally should be 1 wave.
            </description>
        </property>
      
        <property>
            <name>hive.tez.cpu.vcores</name>
            <value>-1</value>
            <description>By default Tez will ask for however many CPUs MapReduce is configured to use per container. 
          This can be used to overwrite the default.
            </description>
        </property>
      
        <property>
            <name>hive.tez.auto.reducer.parallelism</name>
            <value>false</value>
            <description>Turn on Tez' auto reducer parallelism feature. When enabled, Hive will still estimate data sizes and set parallelism estimates. 
          Tez will sample source vertices' output sizes and adjust the estimates at runtime as necessary.
            </description>
        </property>
      
        <property>
            <name>hive.auto.convert.join</name>
            <value>true</value>
            <description>
            </description>
        </property>
      
        <property>
            <name>hive.auto.convert.join.noconditionaltask</name>
            <value>true</value>
            <description>
            </description>
        </property>
      
        <property>
            <name>hive.auto.convert.join.noconditionaltask.size</name>
            <value>1</value>
            <description>
            </description>
        </property>
      
        <property>
            <name>hive.vectorized.execution.enabled</name>
            <value>true</value>
            <description>This flag should be set to true to enable vectorized mode of query execution. The default value is false.
            </description>
        </property>
      
        <property>
            <name>hive.vectorized.execution.reduce.enabled</name>
            <value>false</value>
            <description>This flag should be set to true to enable vectorized mode of the reduce-side of query execution. The default value is true.
            </description>
        </property>
      
        <property>
            <name>hive.cbo.enable</name>
            <value>true</value>
            <description>When true, the cost based optimizer, which uses the Calcite framework, will be enabled.
            </description>
        </property>
      
        <property>
            <name>hive.fetch.task.conversion</name>
            <value>more</value>
            <description>Some select queries can be converted to a single FETCH task, minimizing latency. 
          Currently the query should be single sourced not having any subquery and should not have any aggregations or distincts 
          (which incur RS – ReduceSinkOperator, requiring a MapReduce task), lateral views and joins.
            </description>
        </property>
      
        <property>
            <name>hive.fetch.task.conversion.threshold</name>
            <value>1073741824</value>
            <description>Input threshold (in bytes) for applying hive.fetch.task.conversion. 
          If target table is native, input length is calculated by summation of file lengths. 
          If it's not native, the storage handler for the table can optionally implement the org.apache.hadoop.hive.ql.metadata.InputEstimator interface. 
          A negative threshold means hive.fetch.task.conversion is applied without any input length threshold.
            </description>
        </property>
      
        <property>
            <name>hive.fetch.task.aggr</name>
            <value>false</value>
            <description>Aggregation queries with no group-by clause (for example, select count(*) from src) execute final aggregations in a single reduce task.
          If this parameter is set to true, Hive delegates the final aggregation stage to a fetch task, possibly decreasing the query time.
            </description>
        </property>
      
        <property>
            <name>hive.spark.job.monitor.timeout</name>
            <value>60</value>
            <description>Timeout for job monitor to get Spark job state.
            </description>
        </property>
      
        <property>
            <name>hive.spark.client.future.timeout</name>
            <value>60</value>
            <description>Timeout for requests from Hive client to remote Spark driver.
            </description>
        </property>
      
        <property>
            <name>hive.spark.client.connect.timeout</name>
            <value>1000</value>
            <description>Timeout for remote Spark driver in connecting back to Hive client.
            </description>
        </property>
      
        <property>
            <name>hive.spark.client.channel.log.level</name>
            <value></value>
            <description>Channel logging level for remote Spark driver. One of DEBUG, ERROR, INFO, TRACE, WARN. If unset, TRACE is chosen.
            </description>
        </property>
      
        <property>
            <name>hive.server2.tez.default.queues</name>
            <value></value>
            <description>
          A list of comma separated values corresponding to yarn queues of the same name.
          When hive server 2 is launched in tez mode, this configuration needs to be set
          for multiple tez sessions to run in parallel on the cluster.
            </description>
        </property>
      
        <property>
            <name>hive.server2.tez.sessions.per.default.queue</name>
            <value>1</value>
            <description>
          A positive integer that determines the number of tez sessions that should be
          launched on each of the queues specified by "hive.server2.tez.default.queues".
          Determines the parallelism on each queue.
            </description>
        </property>
      
        <property>
            <name>hive.server2.tez.initialize.default.sessions</name>
            <value>false</value>
            <description>
          This flag is used in hive server 2 to enable a user to use hive server 2 without
          turning on tez for hive server 2. The user could potentially want to run queries
          over tez without the pool of sessions.
            </description>
        </property>
      
        <property>
            <name>hive.support.sql11.reserved.keywords</name>
            <value>true</value>
            <description>Whether to enable support for SQL2011 reserved keywords. When enabled, will support (part of) SQL2011 reserved keywords.
            </description>
        </property>
      
        <property>
            <name>hive.aux.jars.path</name>
            <value></value>
            <description>A comma separated list (with no spaces) of the jar files</description>
        </property>
      
      </configuration>
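Most of the hive-site.xml above is unrelated to the bug; the properties that drive metastore-side storage-based authorization in this setup are the four picked out below. A quick way to extract them, shown here against an inline copy of the relevant fragment (the selection of "relevant" properties is this sketch's assumption):

```python
import xml.etree.ElementTree as ET

# The settings that together enable metastore-side
# storage-based authorization in this report's setup.
RELEVANT = {
    "hive.security.metastore.authorization.manager",
    "hive.metastore.pre.event.listeners",
    "hive.security.metastore.authorization.auth.reads",
    "hive.metastore.authorization.storage.checks",
}

# Inline copy of the relevant fragment of the hive-site.xml above.
HIVE_SITE = """<configuration>
  <property>
    <name>hive.security.metastore.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
  </property>
  <property>
    <name>hive.metastore.pre.event.listeners</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
  </property>
  <property>
    <name>hive.security.metastore.authorization.auth.reads</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.authorization.storage.checks</name>
    <value>true</value>
  </property>
</configuration>"""

def auth_settings(xml_text: str) -> dict:
    """Return {name: value} for the authorization-related properties."""
    root = ET.fromstring(xml_text)
    return {
        p.findtext("name"): p.findtext("value")
        for p in root.iter("property")
        if p.findtext("name") in RELEVANT
    }

for name, value in sorted(auth_settings(HIVE_SITE).items()):
    print(f"{name} = {value}")
```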
      

      Best regards.

      People

        Assignee: Unassigned
        Reporter: Alexandre Linte (BigDataOrange)