  1. Hadoop Common
  2. HADOOP-18103 High performance vectored read API in Hadoop
  3. HADOOP-18391

Improve VectoredReadUtils#readVectored() for direct buffers


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.5
    • Fix Version/s: 3.3.5
    • Component/s: fs

    Description

      Harden the VectoredReadUtils methods for consistent and more robust use, especially in filesystems which do not implement the API themselves.

      VectoredReadUtils.readInDirectBuffer should allocate a temporary buffer with a maximum size, e.g. 4 MB, then do repeated reads and copies; this ensures that you don't OOM with many threads doing ranged requests. Other libraries do this.
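
      A minimal sketch of the bounded-buffer idea, not the actual Hadoop code. The `PositionedSource` interface is a hypothetical stand-in for a positioned `readFully` call, the 4 MB cap is only the figure suggested above, and the method assumes the destination buffer starts at position 0.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

/** Sketch only: fill a (direct) buffer through a small, bounded heap buffer. */
final class ChunkedDirectRead {

  /** Hypothetical stand-in for a positioned readFully(position, buffer, offset, length). */
  interface PositionedSource {
    void readFully(long position, byte[] buffer, int offset, int length) throws IOException;
  }

  /** Illustrative cap on the temporary heap buffer (assumption: 4 MB). */
  static final int MAX_TMP_BUFFER = 4 * 1024 * 1024;

  /**
   * Read `length` bytes starting at `offset` into `dest`, copying through a
   * temporary array of at most `maxTmpBuffer` bytes, then flip `dest` for reading.
   */
  static void readInDirectBuffer(PositionedSource in, long offset, int length,
      ByteBuffer dest, int maxTmpBuffer) throws IOException {
    if (length < 0 || maxTmpBuffer <= 0) {
      throw new IllegalArgumentException("invalid length or buffer size");
    }
    if (dest.remaining() < length) {
      throw new IllegalArgumentException("destination buffer too small");
    }
    // The heap allocation is bounded no matter how large the range is.
    byte[] tmp = new byte[Math.min(length, maxTmpBuffer)];
    int done = 0;
    while (done < length) {
      int chunk = Math.min(tmp.length, length - done);
      in.readFully(offset + done, tmp, 0, chunk); // repeated read
      dest.put(tmp, 0, chunk);                    // copy into the destination
      done += chunk;
    }
    dest.flip();
  }
}
```

      With N threads each reading a large range, peak temporary heap use is about N × the cap rather than N × the range length.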

      readVectored should call validateNonOverlappingAndReturnSortedRanges before iterating.

      This ensures the abfs/s3a requirements are always met and, because ranges will be read in order, that prefetching by other clients keeps their performance good.
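
      A rough sketch of what a sort-and-validate step could look like. The `Range` class and `sortAndValidate` method are hypothetical names for illustration, and the exception type is an assumption; the real `validateNonOverlappingAndReturnSortedRanges` may differ in its details.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Sketch only: sort ranges by offset and reject overlaps. */
final class RangeValidation {

  /** Hypothetical minimal range type: a start offset and a byte count. */
  static final class Range {
    final long offset;
    final int length;

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /** Return a sorted copy of the input, failing if any two ranges overlap. */
  static List<Range> sortAndValidate(List<Range> input) {
    List<Range> sorted = new ArrayList<>(input); // leave the caller's list untouched
    sorted.sort(Comparator.comparingLong((Range r) -> r.offset));
    Range prev = null;
    for (Range cur : sorted) {
      if (cur.offset < 0 || cur.length < 0) {
        throw new IllegalArgumentException("negative offset or length");
      }
      // After sorting, an overlap can only occur with the immediate predecessor.
      if (prev != null && cur.offset < prev.offset + prev.length) {
        throw new IllegalArgumentException("overlapping ranges at offset " + cur.offset);
      }
      prev = cur;
    }
    return sorted;
  }
}
```

      Iterating over the returned list then reads the file in ascending offset order, which is the property the sequential-prefetch argument above relies on.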

      readVectored should add special handling for 0-byte ranges.
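
      One possible shape for the 0-byte case, as a sketch under assumptions: complete the range's future with an empty buffer and never touch the stream. `RangeReader` and `readRange` are hypothetical names, not the Hadoop API.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

/** Sketch only: short-circuit zero-length ranges before any I/O. */
final class ZeroByteRangeSketch {

  /** Hypothetical callback standing in for the real stream read. */
  interface RangeReader {
    ByteBuffer read(long offset, int length, IntFunction<ByteBuffer> allocate) throws Exception;
  }

  static CompletableFuture<ByteBuffer> readRange(long offset, int length,
      IntFunction<ByteBuffer> allocate, RangeReader reader) {
    CompletableFuture<ByteBuffer> result = new CompletableFuture<>();
    if (length == 0) {
      // Nothing to read: hand back an empty buffer and skip the stream entirely.
      result.complete(allocate.apply(0));
      return result;
    }
    try {
      result.complete(reader.read(offset, length, allocate));
    } catch (Exception e) {
      // Surface read failures through the future rather than throwing.
      result.completeExceptionally(e);
    }
    return result;
  }
}
```

      Skipping the read avoids issuing an empty ranged request to stores such as s3a or abfs.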


            People

              Assignee: Mukund Thakur
              Reporter: Steve Loughran
              Votes: 0
              Watchers: 4
