Hive / HIVE-24342

isPathEncrypted should make sure the resolved path is also from HDFS

Details

    • Reviewed

    Description

      Currently, isPathEncrypted decides whether a path is on HDFS only by checking that the path scheme is "hdfs".

      With mounted, ViewFileSystem-based file systems such as ViewFSOverloadScheme or ViewHDFS (HDFS-15289), we also need to check that the resolved path is really on HDFS.

      In the ViewHDFS case, we can mount hdfs://ns1/test ---> o3fs://b.v.ozone1/test

      When a user runs a query against the path hdfs://ns1/test, isPathEncrypted assumes the path is on HDFS because it checks only the path scheme, and the query fails:

       

      0: jdbc:hive2://umag-1.umag.root.xxx.site:218> select * from test30;
      Error: Error while compiling statement: FAILED: SemanticException Unable to determine if hdfs://ns1/test is encrypted: java.lang.UnsupportedOperationException: This API:getEZForPath is specific to DFS. Can't run on other fs:o3fs://bucket.volume.ozone1 (state=42000,code=40000)
      0: jdbc:hive2://umag-1.umag.root.xxx.site:218> cd Closing: 0: jdbc:hive2://umag-1.umag.root.xxx.site:2181,umag-2.umag.root.xxx.site:2181,umag-5.umag.root.xxx.site:2181/default;password=root;principal=hive/umag-5.umag.root.xxx.site@ROOT.HWX.SITE;retries=5;serviceDiscoveryMode=zooKeeper;user=root;zooKeeperNamespace=hiveserver2
      

       

      So here we should use resolvePath to make sure the resolved path is really on HDFS. If the resolved path is not on HDFS (in the case above, it is an o3fs path), isPathEncrypted returns false.
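
      The difference between the two checks can be sketched as follows. This is only a toy model built on the JDK, not Hive's or Hadoop's code: the class ResolvedPathCheck, its mount map, and the resolve method are hypothetical stand-ins for a mounted file system and its resolvePath call. It assumes a single-level prefix mount like the hdfs://ns1/test example above.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a mount-aware "is this path really on HDFS?" check. */
final class ResolvedPathCheck {

    // Hypothetical mount table: mounted prefix -> target prefix.
    private final Map<String, String> mounts = new LinkedHashMap<>();

    void addMount(String source, String target) {
        mounts.put(source, target);
    }

    /** Stand-in for a resolvePath call: rewrites the path if it falls under a mount point. */
    URI resolve(URI path) {
        String p = path.toString();
        for (Map.Entry<String, String> m : mounts.entrySet()) {
            String src = m.getKey();
            if (p.equals(src) || p.startsWith(src + "/")) {
                return URI.create(m.getValue() + p.substring(src.length()));
            }
        }
        return path; // not under any mount point, so it resolves to itself
    }

    /** Scheme-only check: trusts the scheme of the path as given. */
    static boolean isHdfsByScheme(URI path) {
        return "hdfs".equalsIgnoreCase(path.getScheme());
    }

    /** Proposed check: looks at the scheme only after resolving mounts. */
    boolean isHdfsAfterResolve(URI path) {
        return isHdfsByScheme(resolve(path));
    }

    public static void main(String[] args) {
        ResolvedPathCheck check = new ResolvedPathCheck();
        check.addMount("hdfs://ns1/test", "o3fs://b.v.ozone1/test");

        URI path = URI.create("hdfs://ns1/test");
        System.out.println("by scheme:     " + isHdfsByScheme(path));
        System.out.println("after resolve: " + check.isHdfsAfterResolve(path));
    }
}
```

      In this model the scheme-only check accepts hdfs://ns1/test, while the resolve-first check rejects it because the path resolves to an o3fs location. With the second check, an encryption-zone lookup would be skipped for such paths instead of being sent to a non-DFS file system.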

      After this fix, the query passes:

       

      0: jdbc:hive2://umag-1.umag.root.xxx.site:218> select * from test30;
      INFO  : Compiling command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb): select * from test30
      INFO  : No Stats for default@test30, Columns: item, user_id, state, order_id
      INFO  : Semantic Analysis Completed (retrial = false)
      INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:test30.order_id, type:bigint, comment:null), FieldSchema(name:test30.user_id, type:string, comment:null), FieldSchema(name:test30.item, type:string, comment:null), FieldSchema(name:test30.state, type:string, comment:null)], properties:null)
      INFO  : Completed compiling command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb); Time taken: 4.47 seconds
      INFO  : Executing command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb): select * from test30
      INFO  : Completed executing command(queryId=hive_20201031002253_1691548f-6fa8-4ea9-9cd4-87b70fe8f6bb); Time taken: 0.09 seconds
      INFO  : OK
      +------------------+-----------------+--------------+---------------+
      | test30.order_id  | test30.user_id  | test30.item  | test30.state  |
      +------------------+-----------------+--------------+---------------+
      | 1234             | u1              | iphone7      | CA            |
      | 2345             | u1              | ipad         | CA            |
      | 3456             | u2              | desktop      | NY            |
       
       
      +------------------+-----------------+--------------+---------------+
      11 rows selected (6.975 seconds)
      

       

            People

              umamaheswararao Uma Maheswara Rao G

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m