Hadoop Common / HADOOP-14450

ADLS Python client inconsistent when used in tandem with AdlFileSystem


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • None
    • Component/s: fs/adl

    Description

      Impala uses the AdlFileSystem connector to talk to ADLS. As part of the Impala tests, we drop tables and verify that the files belonging to each dropped table have been deleted, on every filesystem that Impala supports. These tests, however, fail with ADLS.

      If I use the Hadoop ADLS connector to delete a file, and then list the parent directory of that file with the Python client linked below within the same second, the client still reports that the file is present in ADLS.
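
      For illustration, a test that has to tolerate this delay could poll the listing until the deleted file disappears. This is only a minimal sketch under stated assumptions: `wait_until_absent`, `list_dir`, and `StaleLister` are hypothetical names, `list_dir` stands in for whatever listing call the Python client exposes, and the fake lister merely simulates the stale listing described above. It does not explain why the listing is stale.

```python
import time


def wait_until_absent(list_dir, parent, name, timeout_s=5.0, interval_s=0.5):
    """Poll list_dir(parent) until `name` is no longer listed.

    `list_dir` is a hypothetical callable returning the entry names under
    `parent`. Returns the number of listings performed; raises TimeoutError
    if `name` is still listed once `timeout_s` has elapsed.
    """
    deadline = time.monotonic() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if name not in list_dir(parent):
            return attempts
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "%s still listed under %s after %.1fs" % (name, parent, timeout_s)
            )
        time.sleep(interval_s)


class StaleLister:
    """Hypothetical stand-in for the client: the first `stale_calls`
    listings still show the deleted file, later ones do not."""

    def __init__(self, stale_calls, stale_name):
        self.remaining = stale_calls
        self.stale_name = stale_name

    def __call__(self, parent):
        if self.remaining > 0:
            self.remaining -= 1
            return [self.stale_name]
        return []


if __name__ == "__main__":
    lister = StaleLister(stale_calls=2, stale_name="part-0.parq")
    tries = wait_until_absent(lister, "/test-warehouse/tbl", "part-0.parq",
                              interval_s=0.01)
    print("file gone after", tries, "listings")
```

      In a real test, `list_dir` would wrap the client's directory listing. Polling only hides the symptom, so the questions below still stand.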

      This is the Python client from Microsoft that we're using in our testing:
      https://github.com/Azure/azure-data-lake-store-python

      Their release notes say that it's still a "pre-release preview":
      https://github.com/Azure/azure-data-lake-store-python/releases

      Questions for the ADLS folks:

      Is this a known issue? If so, will it be fixed soon?
      Or is this expected behavior?

      I'm able to deterministically reproduce it in my tests, with Impala on ADLS.

          People

            Assignee: Atul Sikaria (ASikaria)
            Reporter: Sailesh Mukil (sailesh)
            Votes: 0
            Watchers: 6
