Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 3.3.1, 3.3.2, 3.3.5, 3.3.3, 3.3.4, 3.3.6
- Reviewed
Description
I am using hadoop-azure-3.3.0.jar and have written the following code:
static final String ROOT_DIR = "abfs://ssh-test-fs@sshadlsgen2.dfs.core.windows.net";

Configuration config = new Configuration();
config.set("fs.defaultFS", ROOT_DIR);
config.set("fs.adl.oauth2.access.token.provider.type", "ClientCredential");
config.set("fs.adl.oauth2.client.id", "");
config.set("fs.adl.oauth2.credential", "");
config.set("fs.adl.oauth2.refresh.url", "");
config.set("fs.azure.account.key.sshadlsgen2.dfs.core.windows.net", ACCESS_TOKEN);
config.set("fs.azure.skipUserGroupMetadataDuringInitialization", "true");

FileSystem fs = FileSystem.get(config);
System.out.println("\nfs:'" + fs.toString() + "'");

FileStatus status = fs.getFileStatus(new Path(ROOT_DIR)); // !!! Exception in 3.3.1-*
System.out.println("\nstatus:'" + status.toString() + "'");
It worked properly in versions before 3.3.1.
But in 3.3.1 it fails with the exception:
Caused by: Operation failed: "Value for one of the query parameters specified in the request URI is invalid.", 400, HEAD, https://sshadlsgen2.dfs.core.windows.net/ssh-test-fs?upn=false&action=getAccessControl&timeout=90
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.completeExecute(AbfsRestOperation.java:218)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.lambda$execute$0(AbfsRestOperation.java:181)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.measureDurationOfInvocation(IOStatisticsBinding.java:494)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDurationOfInvocation(IOStatisticsBinding.java:465)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:179)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:942)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:924)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getFileStatus(AzureBlobFileSystemStore.java:846)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:507)
I performed some research and found:
In hadoop-azure-3.3.0.jar we see:
org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore {
    ...
    public FileStatus getFileStatus(final Path path) throws IOException {
        ...
        Line 604: op = client.getAclStatus(AbfsHttpConstants.FORWARD_SLASH + AbfsHttpConstants.ROOT_PATH);
        ...
    }
    ...
}
and this code produces the REST request:
https://sshadlsgen2.dfs.core.windows.net/ssh-test-fs//?upn=false&action=getAccessControl&timeout=90
There is a trailing slash in the path part "...ssh-test-fs//?upn=false...". This request works properly.
But from hadoop-azure-3.3.1.jar up to the latest hadoop-azure-3.3.6.jar we see:
org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore {
    ...
    public FileStatus getFileStatus(final Path path) throws IOException {
        ...
        perfInfo.registerCallee("getAclStatus");
        Line 846: op = client.getAclStatus(getRelativePath(path));
        ...
    }
    ...
    Line 1492: private String getRelativePath(final Path path) {
        ...
        return path.toUri().getPath();
    }
}
and this code produces the REST request:
https://sshadlsgen2.dfs.core.windows.net/ssh-test-fs?upn=false&action=getAccessControl&timeout=90
There is no trailing slash in the path part "...ssh-test-fs?upn=false...". This happens because the new code "path.toUri().getPath();" produces an empty string.
This request fails with the message:
Caused by: Operation failed: "Value for one of the query parameters specified in the request URI is invalid.", 400, HEAD, https://sshadlsgen2.dfs.core.windows.net/ssh-test-fs?upn=false&action=getAccessControl&timeout=90
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.completeExecute(AbfsRestOperation.java:218)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.lambda$execute$0(AbfsRestOperation.java:181)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.measureDurationOfInvocation(IOStatisticsBinding.java:494)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDurationOfInvocation(IOStatisticsBinding.java:465)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:179)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:942)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:924)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getFileStatus(AzureBlobFileSystemStore.java:846)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:507)
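The empty path described above can be reproduced without Hadoop on the classpath. The following is a minimal sketch that uses plain java.net.URI as a stand-in for what "path.toUri().getPath()" returns; the class name RootPathDemo and the helper pathOrRoot are hypothetical and are not part of hadoop-azure, and the helper only illustrates one possible guard, not the fix that was actually committed.

```java
import java.net.URI;

public class RootPathDemo {

    // Hypothetical helper (not a hadoop-azure API): treat an empty URI path as the root "/".
    static String pathOrRoot(URI uri) {
        String path = uri.getPath();
        return (path == null || path.isEmpty()) ? "/" : path;
    }

    public static void main(String[] args) {
        // Same filesystem root as in the report, without and with a trailing slash.
        URI noSlash = URI.create("abfs://ssh-test-fs@sshadlsgen2.dfs.core.windows.net");
        URI withSlash = URI.create("abfs://ssh-test-fs@sshadlsgen2.dfs.core.windows.net/");

        // A URI with an authority but no path component has an empty path.
        System.out.println("no slash:   [" + noSlash.getPath() + "]");
        // With the trailing slash the path is "/".
        System.out.println("with slash: [" + withSlash.getPath() + "]");
        // The guard maps the empty path back to the root.
        System.out.println("guarded:    [" + pathOrRoot(noSlash) + "]");
    }
}
```

If the same holds for org.apache.hadoop.fs.Path, passing a root URI that ends in "/" to getFileStatus might avoid the empty relative path, but that is an untested assumption here.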
Because this is the case for all hadoop-azure 3.3 jar versions that use log4j 2 rather than 1.2.17, we cannot upgrade from the version we are using.
I attach a sample Maven project to try: test_hadoop-azure-3_3_1-FileSystem_getFileStatus - Copy.zip
Attachments
Issue Links
- is caused by
  - HADOOP-16612 Track Azure Blob File System client-perceived latency (Resolved)
  - HADOOP-16916 ABFS: Delegation SAS generator for integration with Ranger (Resolved)
- links to