Beam / BEAM-2138

Support atomic rename within FileSystem to replace inefficient Hadoop copy

Details

    • Type: Improvement
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved

    Description

      The Hadoop copy operation is inefficient because it streams the entire resource through the machine performing the copy. Hadoop file system implementations, however, do support an efficient rename.

      Apache Beam sinks rely on being able to rename files atomically, which is currently done with a FileSystem copy followed by a delete.
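
      The difference between the two approaches can be sketched as follows. This is only an illustration under stated assumptions: it uses the JDK's java.nio.file API on a local disk as a stand-in, not Beam's FileSystem abstraction or the Hadoop client, and the class and method names (RenameSketch, copyThenDelete, atomicRename) are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class RenameSketch {
    // Approach described in the issue as the current one: every byte of the
    // source is read and rewritten by the machine doing the copy, and the
    // source is removed afterwards. Between the two steps both files exist.
    public static void copyThenDelete(Path src, Path dst) throws IOException {
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        Files.delete(src);
    }

    // Approach the issue asks for: a single rename. ATOMIC_MOVE makes the JDK
    // throw AtomicMoveNotSupportedException instead of silently falling back
    // to copy + delete when the underlying file system cannot rename atomically.
    public static void atomicRename(Path src, Path dst) throws IOException {
        Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical temporary output file, as a sink might write it.
        Path dir = Files.createTempDirectory("rename-sketch");
        Path tmp = Files.write(dir.resolve("part-0.tmp"), "data".getBytes());
        Path finalPath = dir.resolve("part-0");

        atomicRename(tmp, finalPath);
        System.out.println("final exists: " + Files.exists(finalPath));
        System.out.println("temp exists: " + Files.exists(tmp));
    }
}
```

      With a rename, the temporary file becomes the final file in one step and no data is streamed through the caller, which is the property the sinks depend on. How a real Hadoop-backed implementation would expose this is left open by the issue.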

          People

            Assignee: Unassigned
            Reporter: Luke Cwik (lcwik)
            Votes: 0
            Watchers: 4