BEAM-10295

FileBasedSink: allow setting temp directory provider per dynamic destination

Details

    Description

      Dynamic file destinations allow value-dependent writes in FileBasedSink. With the Hadoop file system, this means a user can write some values to a destination on cluster A and other values to a destination on cluster B.

      Since BEAM-7613 was fixed, this works until the moveToOutputFiles method is called. That method internally calls FileSystems.rename, which requires the source files (the temporary files) and the target files (resolved by the dynamic destination's function) to be on the same cluster. However, the temp directory provider can be set only once per file sink, so the temporary files cannot follow each destination to its own cluster.
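
      To make the constraint concrete, here is a minimal, self-contained sketch in plain Java. It is not Beam code: the class SameClusterCheck, the helper sameFileSystem, and the cluster-a / cluster-b paths are all made up for illustration. It assumes that "same cluster" can be approximated by comparing the scheme and authority of the two URIs.

```java
import java.net.URI;
import java.util.Objects;

/** Illustration only: a hypothetical helper, not part of Beam or Hadoop. */
class SameClusterCheck {

    /** True when both URIs share scheme and authority, e.g. the same HDFS namenode. */
    public static boolean sameFileSystem(URI a, URI b) {
        return Objects.equals(a.getScheme(), b.getScheme())
            && Objects.equals(a.getAuthority(), b.getAuthority());
    }

    public static void main(String[] args) {
        // One temp directory for the whole sink (assumed to live on cluster-a).
        URI temp = URI.create("hdfs://cluster-a/tmp/.temp-beam/shard-0");
        // Two dynamic destinations resolved to different clusters.
        URI outA = URI.create("hdfs://cluster-a/out/part-0");
        URI outB = URI.create("hdfs://cluster-b/out/part-0");

        System.out.println(sameFileSystem(temp, outA)); // true: rename stays on one cluster
        System.out.println(sameFileSystem(temp, outB)); // false: rename would cross clusters
    }
}
```

      With a single temp directory, at most one of the two destinations can pass this kind of check, which is the situation described above.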

      This could be fixed by adding a getTempDirectoryProvider method to dynamic destinations (e.g. to DefaultFilenamePolicy.Params), so that each destination can supply a temp directory on its own cluster.
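
      One possible shape for this proposal, sketched in plain Java rather than against Beam's real classes. Everything here is hypothetical: the Destination class only stands in for something like DefaultFilenamePolicy.Params, a Supplier<String> stands in for whatever provider type Beam would use, and resolveTempDirectory is an assumed helper. The sketch keeps the sink-wide temp directory as a fallback so existing pipelines would behave as before.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical model of the proposal; none of these types are Beam's actual API. */
class PerDestinationTempDir {

    /** Stand-in for a dynamic destination such as DefaultFilenamePolicy.Params. */
    public static final class Destination {
        private final String outputPrefix;
        // May be null, meaning "use the sink-wide temp directory".
        private final Supplier<String> tempDirectoryProvider;

        public Destination(String outputPrefix, Supplier<String> tempDirectoryProvider) {
            this.outputPrefix = outputPrefix;
            this.tempDirectoryProvider = tempDirectoryProvider;
        }

        public String getOutputPrefix() {
            return outputPrefix;
        }

        /** The proposed accessor: empty when the destination does not override the default. */
        public Optional<Supplier<String>> getTempDirectoryProvider() {
            return Optional.ofNullable(tempDirectoryProvider);
        }
    }

    /** Prefers the destination's own temp directory, else falls back to the sink default. */
    public static String resolveTempDirectory(Destination destination, Supplier<String> sinkDefault) {
        return destination.getTempDirectoryProvider().orElse(sinkDefault).get();
    }

    public static void main(String[] args) {
        Supplier<String> sinkDefault = () -> "hdfs://cluster-a/tmp";
        Destination onA = new Destination("hdfs://cluster-a/out/events", null);
        Destination onB = new Destination("hdfs://cluster-b/out/events", () -> "hdfs://cluster-b/tmp");

        System.out.println(resolveTempDirectory(onA, sinkDefault)); // hdfs://cluster-a/tmp
        System.out.println(resolveTempDirectory(onB, sinkDefault)); // hdfs://cluster-b/tmp
    }
}
```

      In this sketch the temp files for each destination end up on the same cluster as the final output, so the final rename never has to cross clusters.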

          People

            Assignee: Unassigned
            Reporter: David Janicek (davidak09)
            Votes: 1
            Watchers: 3