  1. Hadoop Common
  2. HADOOP-18610

ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS


Details

    • Type: Improvement
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 3.3.4
    • Fix Version/s: None
    • Component/s: tools

    Description

      In January 2023, Microsoft Azure AKS replaced its original pod-managed identity with Azure Active Directory (Azure AD) workload identities (preview), which integrate with native Kubernetes capabilities to federate with external identity providers. This approach is simpler to use and deploy.

      Refer to https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview and https://azure.github.io/azure-workload-identity/docs/introduction.html for more details.

      The basic use case is accessing Azure cloud resources (such as cloud storage) from a Kubernetes workload (such as one running on AKS) using an Azure managed identity federated with a Kubernetes service account. Azure AD workload identity projects the following credential environment variables into the pod:

      AZURE_AUTHORITY_HOST: (Injected by the webhook, https://login.microsoftonline.com/)

      AZURE_CLIENT_ID: (Injected by the webhook)

      AZURE_TENANT_ID: (Injected by the webhook)

      AZURE_FEDERATED_TOKEN_FILE: (Injected by the webhook, /var/run/secrets/azure/tokens/azure-identity-token)

      The file pointed to by AZURE_FEDERATED_TOKEN_FILE contains a JWT (JSON Web Token) client assertion. We can send this assertion to AZURE_AUTHORITY_HOST (the URL is AZURE_AUTHORITY_HOST + tenantId + "/oauth2/v2.0/token") to request an Azure AD token, which can then be used to access Azure cloud resources directly.
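
      The exchange described above can be sketched as follows. This is a minimal illustration, not the attached patch: the class WorkloadIdentityTokenRequest and its methods are hypothetical names, the storage scope value is an assumption for Azure Storage, and the form fields follow the OAuth2 client credentials flow with a JWT client assertion as documented for the Microsoft identity platform v2.0 endpoint. It only builds the endpoint URL and the form-encoded request body, without sending anything.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of Hadoop): builds the token request for a JWT client assertion. */
final class WorkloadIdentityTokenRequest {
    // Assumption: scope for Azure Storage; other resources would need a different scope.
    static final String STORAGE_SCOPE = "https://storage.azure.com/.default";
    static final String JWT_BEARER =
        "urn:ietf:params:oauth:client-assertion-type:jwt-bearer";

    private WorkloadIdentityTokenRequest() { }

    /** Returns AZURE_AUTHORITY_HOST + tenantId + "/oauth2/v2.0/token", tolerating a missing trailing slash. */
    static String tokenEndpoint(String authorityHost, String tenantId) {
        String base = authorityHost.endsWith("/") ? authorityHost : authorityHost + "/";
        return base + tenantId + "/oauth2/v2.0/token";
    }

    /** Returns the application/x-www-form-urlencoded body to POST to the token endpoint. */
    static String formBody(String clientId, String clientAssertion) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("grant_type", "client_credentials");
        params.put("client_id", clientId);
        params.put("client_assertion_type", JWT_BEARER);
        params.put("client_assertion", clientAssertion);
        params.put("scope", STORAGE_SCOPE);
        return params.entrySet().stream()
            .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
            .collect(Collectors.joining("&"));
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

      In a pod, the arguments would come from System.getenv("AZURE_AUTHORITY_HOST"), System.getenv("AZURE_TENANT_ID"), System.getenv("AZURE_CLIENT_ID"), and the contents of the file named by AZURE_FEDERATED_TOKEN_FILE.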

      This pattern is common across cloud providers such as AWS and GCP. The Hadoop AWS integration has WebIdentityTokenCredentialProvider to handle the same case.

      The existing MsiTokenProvider can only handle a managed identity associated with an Azure VM instance. We need to implement a WorkloadIdentityTokenProvider that handles the Azure Workload Identity case. To support it, we need to add one method (getTokenUsingJWTAssertion) to AzureADAuthenticator, which WorkloadIdentityTokenProvider will call.
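
      One possible shape for such a provider is sketched below. It is a self-contained illustration under stated assumptions, not the proposed Hadoop code: WorkloadIdentityTokenProviderSketch, its nested Token class, and the five-minute refresh margin are all hypothetical, and the exchange function stands in for the proposed getTokenUsingJWTAssertion, whose actual signature is not given in this issue. The sketch re-reads the federated token file on every refresh, on the assumption that the projected token may be rotated while the pod is running.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.function.Function;

/** Hypothetical sketch of a workload identity token provider with simple caching. */
final class WorkloadIdentityTokenProviderSketch {

    /** Minimal stand-in for an access token and its expiry time. */
    static final class Token {
        final String value;
        final Instant expiry;

        Token(String value, Instant expiry) {
            this.value = value;
            this.expiry = expiry;
        }
    }

    // Assumption: refresh five minutes before the cached token expires.
    private static final Duration REFRESH_MARGIN = Duration.ofMinutes(5);

    private final Path tokenFile;
    // Stand-in for the proposed getTokenUsingJWTAssertion: client assertion in, access token out.
    private final Function<String, Token> exchange;
    private Token cached;

    WorkloadIdentityTokenProviderSketch(Path tokenFile, Function<String, Token> exchange) {
        this.tokenFile = tokenFile;
        this.exchange = exchange;
    }

    /** Returns a cached access token, exchanging a fresh assertion when it is missing or near expiry. */
    synchronized String getAccessToken() {
        if (cached == null || Instant.now().isAfter(cached.expiry.minus(REFRESH_MARGIN))) {
            cached = exchange.apply(readAssertion());
        }
        return cached.value;
    }

    /** Reads the client assertion from disk each time, since the file contents may change. */
    private String readAssertion() {
        try {
            return new String(Files.readAllBytes(tokenFile), StandardCharsets.UTF_8).trim();
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read federated token file " + tokenFile, e);
        }
    }
}
```

      Passing the exchange step in as a function keeps the caching logic testable without network access; in Hadoop the call would presumably go through AzureADAuthenticator instead.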

       

      Attachments

        1. HADOOP-18610-preview.patch
          12 kB
          Haifeng Chen

            People

              Assignee: Anuj Modi (anujmodi2021)
              Reporter: Haifeng Chen (haifengchen)
              Votes: 3
              Watchers: 5

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated: 168h
                  Remaining: 168h
                  Logged: Not Specified