  1. Hadoop Common
  2. HADOOP-18179

Boost S3A Stream Read Performance

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.2
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels: None

    Description

      Calibrate S3A input stream performance against recent applications and data formats, and improve it where necessary.

      HADOOP-18028 is a key part of this, but there are other issues/opportunities:

      1. We could add machine-parsable trace-level logging in FSDataInputStream to record how the stream APIs are invoked, collect that data from real apps, and analyze it.
      2. Implement those APIs which some apps use (ByteBufferPositionedReadable), not so much for the direct implementation as to get better information from the app about its read plan.
      3. The `normal` mode doesn't switch from sequential to random IO on forward seeks. Is that always appropriate?
      4. Choose different buffering options when doing whole-file IO vs sequential vs random.
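
A rough sketch of what items 1 and 3 could look like in practice, using only the JDK. This is not Hadoop code: the class `StreamTraceSketch`, its methods and the `key=value` log layout are all hypothetical, picked to show how a trace of seek and read calls might be recorded in a machine-parsable form and then summarized into forward vs backward seek counts.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch, not Hadoop code: records stream calls as
 * key=value lines and tallies the seek pattern as it goes.
 */
public class StreamTraceSketch {

  private final List<String> lines = new ArrayList<>();
  private long pos = 0;           // current stream position
  private int forwardSeeks = 0;   // seeks to a position ahead of pos
  private int backwardSeeks = 0;  // seeks to a position behind pos
  private long bytesRead = 0;     // total bytes requested via read()

  /** Record a seek and classify it relative to the current position. */
  public void seek(long target) {
    if (target > pos) {
      forwardSeeks++;
    } else if (target < pos) {
      backwardSeeks++;
    }
    lines.add("op=seek from=" + pos + " to=" + target);
    pos = target;
  }

  /** Record a read of len bytes at the current position, then advance. */
  public void read(int len) {
    lines.add("op=read pos=" + pos + " len=" + len);
    pos += len;
    bytesRead += len;
  }

  /** The raw trace, one parsable line per call. */
  public List<String> lines() {
    return lines;
  }

  /** One-line summary of the access pattern seen so far. */
  public String summary() {
    return "forwardSeeks=" + forwardSeeks
        + " backwardSeeks=" + backwardSeeks
        + " bytesRead=" + bytesRead;
  }

  public static void main(String[] args) {
    // Assumed access plan, loosely modelled on a columnar file reader:
    // read a footer near the end, then jump to two earlier ranges.
    StreamTraceSketch trace = new StreamTraceSketch();
    trace.seek(1000);
    trace.read(8);
    trace.seek(100);
    trace.read(50);
    trace.seek(400);
    trace.read(50);
    trace.lines().forEach(System.out::println);
    System.out.println(trace.summary());
  }
}
```

A trace like this from real applications would show how often forward seeks dominate, which is the data needed to decide whether `normal` mode should react to them.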

            People

              Assignee: Ahmar Suhail (ahmar)
              Reporter: Steve Loughran (stevel@apache.org)
              Votes: 0
              Watchers: 9
