Hive / HIVE-12154

Load data inpath 'PATTERN' into table should only check files that match the PATTERN

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.13.1, 1.0.0, 1.1.0, 1.2.0, 1.2.1

    Description

      We use Flume to sink data to the HDFS directory '/tmp/test/'. Temporary files that Flume is actively writing have the suffix .tmp; once the write finishes, the file is renamed to SAMPLE.data.

      A periodic Hive task executes a script like this:

      load data inpath '/tmp/test/*.data' into table t1;

      This exception occurs intermittently:

      2015-10-12 19:38:00,133 | ERROR | HiveServer2-Handler-Pool: Thread-57 | FAILED: HiveAuthzPluginException Error getting permissions for hdfs://hacluster/tmp/test/*.data: null
      org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException: Error getting permissions for hdfs://hacluster/tmp/test/*.data: null
      ...
      Caused by: java.io.FileNotFoundException: Path not found
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:8175)

      I dug into the code and found that SQLStdHiveAuthorizationValidator checks every file in the /tmp/test/ directory, not only the files that match the pattern. If a .tmp file is renamed to .data while its permissions are being checked, HDFS can no longer find the file under its old name and the check fails.
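
      The behavior requested here can be sketched as follows: filter the directory listing by the load pattern first, and run permission checks only on the names that survive. This is an illustration, not Hive's actual code. The class GlobFilterSketch, the method matching, and the file names are hypothetical, and it uses the JDK's java.nio glob matcher on plain file names instead of HDFS. In Hive itself the analogous step would presumably go through Hadoop's FileSystem.globStatus, but that wiring is an assumption.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not part of Hive or Hadoop.
class GlobFilterSketch {

    // Returns only the file names that match the glob, preserving input order.
    // A permission check would then be run on this filtered list only.
    static List<String> matching(String glob, List<String> fileNames) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> result = new ArrayList<>();
        for (String name : fileNames) {
            if (matcher.matches(Paths.get(name))) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up listing of /tmp/test/: one file is still being written (.tmp).
        List<String> listing = Arrays.asList("SAMPLE.data", "OTHER.data.tmp", "a.data");
        // The in-flight .tmp file is excluded before any permission check.
        System.out.println(matching("*.data", listing)); // prints [SAMPLE.data, a.data]
    }
}
```

      With this ordering, a .tmp file that gets renamed mid-check is never inspected, because it is excluded before any permission lookup happens.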

          People

            Assignee: Unassigned
            Reporter: Niklaus Xiao (niklaus.xiao)