  1. Hadoop Common
  2. HADOOP-17981

Support etag-assisted renames in FileOutputCommitter

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 3.4.0
    • Fix Version/s: None
    • Component/s: fs, fs/azure

    Description

      To deal with some throttling/retry issues in object stores,
      pass the FileStatus entries retrieved during listing
      into a private interface ResilientCommitByRename which filesystems
      may implement to use extra attributes in the listing (etag, version)
      to constrain and validate the operation.
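
A minimal, self-contained sketch of the idea, not the actual Hadoop code. The issue does not give the signature of ResilientCommitByRename, so the method name `commitSingleFileByRename`, the `ListedFile` type (a stand-in for FileStatus plus its etag) and the `InMemoryStore` class are all hypothetical. It also assumes the store preserves a file's etag across a rename. The point it illustrates: if a rename reports failure (for example after a throttled or lost response), the etag captured during listing can be used to check whether the rename actually completed.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a listing entry: path plus the etag seen at list time. */
final class ListedFile {
    final String path;
    final String etag; // may be null if the store supplies no etag

    ListedFile(String path, String etag) {
        this.path = path;
        this.etag = etag;
    }
}

/** Hypothetical shape of the private interface; the real signature may differ. */
interface ResilientCommitByRename {
    void commitSingleFileByRename(ListedFile source, String dest) throws IOException;
}

/** Toy in-memory "object store" mapping path to etag (assumption: etag survives rename). */
final class InMemoryStore implements ResilientCommitByRename {
    private final Map<String, String> etags = new HashMap<>();
    private boolean loseNextResponse;

    void put(String path, String etag) {
        etags.put(path, etag);
    }

    Optional<String> etagOf(String path) {
        return Optional.ofNullable(etags.get(path));
    }

    /** Make the next rename complete on the "server" but still report failure. */
    void loseNextResponse() {
        loseNextResponse = true;
    }

    /** Raw rename with no recovery logic. */
    private void rawRename(String src, String dest) throws IOException {
        String etag = etags.remove(src);
        if (etag == null) {
            throw new FileNotFoundException(src);
        }
        etags.put(dest, etag);
        if (loseNextResponse) {
            loseNextResponse = false;
            throw new IOException("simulated timeout after rename of " + src);
        }
    }

    @Override
    public void commitSingleFileByRename(ListedFile source, String dest) throws IOException {
        try {
            rawRename(source.path, dest);
        } catch (IOException e) {
            // Validate using the etag from the listing: if the source is gone and
            // the destination carries the same etag, the rename did succeed.
            boolean recovered = source.etag != null
                    && !etags.containsKey(source.path)
                    && source.etag.equals(etags.get(dest));
            if (!recovered) {
                throw e;
            }
        }
    }
}
```

A committer would call this only when the filesystem implements the interface, passing the entry it already obtained from the listing, and fall back to a plain rename otherwise.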

      Although this targets Azure, GCS and other stores could use it. There is no point in supporting it in S3A, as S3A users shouldn't use this committer.

      1. We are not going to make any changes to FileSystem, as it carries explicit guarantees of public use and stability.
        I am not going to make that change, as Hive would suddenly start expecting it to work forever.
      2. I'm not planning to merge this in, as the manifest committer is going to include this and more (MAPREDUCE-7341).

      However, I do need to get this in on a branch, so I am doing this work on trunk for dev & test and for others to review.

            People

              Assignee: stevel@apache.org Steve Loughran
              Reporter: stevel@apache.org Steve Loughran
              Votes: 0
              Watchers: 3

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 12h 50m