Beam / BEAM-12771

InvalidPathException on Windows when removing staging directory

Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: 2.31.0
    • None
    • None
    • Windows

    Description

      When running the word count example on Windows (for example with the Flink runner), an InvalidPathException is thrown once execution has finished. The message is:

       

      INFO:apache_beam.utils.subprocess_server:b'WARNUNG: Failed to remove job staging directory for token job_941a4c92-f66d-4d71-8b68-b16cd5026750.'
      INFO:apache_beam.utils.subprocess_server:b'java.nio.file.InvalidPathException: Illegal char <*> at index 136: C:\\Users\\Thomas\\AppData\\Local\\Temp\\beam-tempgkbgwplc\\artifactsuu86zi08\\d82c20307c13adfba9486c5a114a464cc3fb072ad60623a58839256a91a6c9f8\\*'
      INFO:apache_beam.utils.subprocess_server:b'\tat sun.nio.fs.WindowsPathParser.normalize(Unknown Source)'
      INFO:apache_beam.utils.subprocess_server:b'\tat sun.nio.fs.WindowsPathParser.parse(Unknown Source)'
      INFO:apache_beam.utils.subprocess_server:b'\tat sun.nio.fs.WindowsPathParser.parse(Unknown Source)'
      INFO:apache_beam.utils.subprocess_server:b'\tat sun.nio.fs.WindowsPath.parse(Unknown Source)'
      INFO:apache_beam.utils.subprocess_server:b'\tat sun.nio.fs.WindowsFileSystem.getPath(Unknown Source)'
      INFO:apache_beam.utils.subprocess_server:b'\tat java.nio.file.Paths.get(Unknown Source)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.sdk.io.LocalResourceId.resolveLocalPathWindowsOS(LocalResourceId.java:103)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.sdk.io.LocalResourceId.resolve(LocalResourceId.java:65)'
      INFO:apache_beam.runners.portability.portable_runner:Job state changed to DONE
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.sdk.io.LocalResourceId.resolve(LocalResourceId.java:36)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.fnexecution.artifact.ArtifactStagingService$1.removeStagedArtifacts(ArtifactStagingService.java:182)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.fnexecution.artifact.ArtifactStagingService.removeStagedArtifacts(ArtifactStagingService.java:115)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.jobsubmission.JobServerDriver.lambda$createJobService$0(JobServerDriver.java:66)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.jobsubmission.InMemoryJobService.lambda$run$0(InMemoryJobService.java:261)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.jobsubmission.JobInvocation.setState(JobInvocation.java:249)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.jobsubmission.JobInvocation.access$200(JobInvocation.java:51)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.jobsubmission.JobInvocation$1.onSuccess(JobInvocation.java:115)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.runners.jobsubmission.JobInvocation$1.onSuccess(JobInvocation.java:101)'
      INFO:apache_beam.utils.subprocess_server:b'\tat org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1058)'
      INFO:apache_beam.utils.subprocess_server:b'\tat java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)'
      INFO:apache_beam.utils.subprocess_server:b'\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)'
      INFO:apache_beam.utils.subprocess_server:b'\tat java.lang.Thread.run(Unknown Source)'

      The error seems to be in this function:

      private LocalResourceId resolveLocalPathWindowsOS(String other, ResolveOptions resolveOptions) {
        // Swap each '*' for a random UUID so that Paths.get accepts the string on Windows.
        String uuid = UUID.randomUUID().toString();
        Path pathAsterisksReplaced = Paths.get(pathString.replaceAll("\\*", uuid));
        String otherAsterisksReplaced = other.replaceAll("\\*", uuid);
        return new LocalResourceId(
            // The resolved string has its '*' restored before this second Paths.get call.
            Paths.get(
                pathAsterisksReplaced
                    .resolve(otherAsterisksReplaced)
                    .toString()
                    .replaceAll(uuid, "\\*")),
            resolveOptions.equals(StandardResolveOptions.RESOLVE_DIRECTORY));
      }
      

      Paths.get throws the exception because it does not accept wildcards on Windows. The function already replaces the wildcard with uuid before the first call to Paths.get, but the return statement then calls Paths.get a second time on a string in which uuid has been replaced by the wildcard again, and that second call throws.
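To make the mechanism concrete, here is a minimal standalone sketch; it is not Beam code and not a proposed patch. The class WildcardPathSketch and both of its methods are hypothetical names introduced only for this illustration. It performs the same placeholder round trip, but restores the wildcard in the resulting string only and never builds a Path from it, then checks whether the default file system can parse that string.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical illustration only; none of these names exist in Beam.
public class WildcardPathSketch {

  /** Returns true if the default file system can parse the string as a path. */
  public static boolean isParsable(String pathString) {
    try {
      Paths.get(pathString);
      return true;
    } catch (InvalidPathException e) {
      return false;
    }
  }

  /**
   * Resolves other against base while every '*' is hidden behind a random
   * placeholder, then restores '*' in the resulting string. Unlike the
   * function quoted above, no Path is created from the restored string.
   */
  public static String resolveKeepingWildcards(String base, String other) {
    String placeholder = UUID.randomUUID().toString();
    // String.replace is a literal replacement, so no regex escaping is needed.
    Path basePath = Paths.get(base.replace("*", placeholder));
    String resolved = basePath.resolve(other.replace("*", placeholder)).toString();
    return resolved.replace(placeholder, "*");
  }

  public static void main(String[] args) {
    String resolved = resolveKeepingWildcards("staging", "*");
    System.out.println(resolved);
    // Reports whether Paths.get would accept the restored string on this platform.
    System.out.println(isParsable(resolved));
  }
}
```

On Windows the second line should print false, which corresponds to the InvalidPathException in the log above; on file systems that allow '*' in file names it prints true. Whether a string-based result like this would fit LocalResourceId, whose constructor takes a Path, is an open question that this sketch does not answer.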

      Unfortunately I don't really understand the logic of the function, so I'm not sure what the best fix would be.

       

          People

            Assignee: Unassigned
            Reporter: Thomas Krause (aKzenT)
            Votes: 0
            Watchers: 3
