  1. Apache Tez
  2. TEZ-3732

Reduce Object size of InputAttemptIdentifier and MapOutput for large jobs

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

    Description

      Objects in 64-bit Java occupy a 12-byte header plus the size of their fields, rounded up to a multiple of 8 bytes.

      InputAttemptIdentifier -> 33 bytes, which is aligned up to 40 bytes.
      This class is just one byte over the 32-byte boundary, so shrinking it by a single byte saves 8 bytes per object.
      That is ~8 MB saved for 1,000,000 inputs and ~80 MB saved for a task with 10,000,000 inputs to fetch (yes, this is a real job).
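
      The arithmetic above can be checked with a small sketch. It assumes the sizing rule stated in the description (12-byte header, 8-byte alignment); ObjectSizeSketch and its methods are hypothetical helpers for illustration, not part of Tez, and the 21 bytes of fields is simply 33 minus the 12-byte header.

```java
// Hypothetical helper, not part of Tez: estimates shallow object size under
// the description's assumptions (12-byte header, 8-byte alignment).
final class ObjectSizeSketch {
    static final long HEADER_BYTES = 12; // assumed object header size
    static final long ALIGNMENT = 8;     // assumed object alignment

    // Raw size is the header plus the bytes used by the object's fields.
    static long rawSize(long fieldBytes) {
        return HEADER_BYTES + fieldBytes;
    }

    // Round a raw size up to the next multiple of the alignment.
    static long alignedSize(long rawBytes) {
        return (rawBytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Bytes saved over objectCount objects when the raw size shrinks.
    static long totalSavings(long rawBefore, long rawAfter, long objectCount) {
        return (alignedSize(rawBefore) - alignedSize(rawAfter)) * objectCount;
    }

    public static void main(String[] args) {
        long before = rawSize(21); // 33 bytes, as quoted for InputAttemptIdentifier
        long after = rawSize(20);  // 32 bytes, one byte smaller
        System.out.println(alignedSize(before));                     // 40
        System.out.println(alignedSize(after));                      // 32
        System.out.println(totalSavings(before, after, 1_000_000L)); // 8000000
        System.out.println(totalSavings(before, after, 10_000_000L)); // 80000000
    }
}
```

      Actual layouts depend on the JVM and its flags, so measured sizes may differ from this estimate.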

      MapOutput -> 45 bytes, which is aligned up to 48 bytes.
      This class can be split into sub-classes so that each type carries only its own fields instead of paying the object size cost of every other type:
      Wait, InMemory and DiskDirect -> 32 bytes
      Disk -> 40 bytes
      Total savings are harder to account for, but exceed those of the case above.
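
      The sub-classing idea can be illustrated with a minimal sketch. This is not the TEZ-3732 patch: MapOutputSketch, its nested classes, and all field names are hypothetical, chosen only to show how type-specific fields move out of a shared base class so that each instance pays only for the fields it uses.

```java
// Hypothetical sketch, not the actual Tez classes: every name and field here
// is illustrative only.
final class MapOutputSketch {
    // Base class holds only the state that every variant needs.
    abstract static class Output {
        private final long size;
        Output(long size) { this.size = size; }
        long getSize() { return size; }
        abstract String kind();
    }

    // A placeholder variant carries no extra fields at all.
    static final class Wait extends Output {
        Wait() { super(0L); }
        @Override String kind() { return "WAIT"; }
    }

    // Only the in-memory variant pays for the buffer reference.
    static final class InMemory extends Output {
        private final byte[] data;
        InMemory(byte[] data) { super(data.length); this.data = data; }
        byte[] getData() { return data; }
        @Override String kind() { return "MEMORY"; }
    }

    // Only the disk variant pays for the path and offset fields.
    static final class Disk extends Output {
        private final String path;
        private final long offset;
        Disk(String path, long offset, long size) {
            super(size);
            this.path = path;
            this.offset = offset;
        }
        String getPath() { return path; }
        long getOffset() { return offset; }
        @Override String kind() { return "DISK"; }
    }

    public static void main(String[] args) {
        Output[] outputs = {
            new Wait(), new InMemory(new byte[4]), new Disk("/tmp/example", 0L, 128L)
        };
        for (Output o : outputs) {
            System.out.println(o.kind() + " size=" + o.getSize());
        }
    }
}
```

      With a single class, every instance would reserve space for the buffer, the path, and the offset regardless of its type; with the split, those costs fall only on the variants that need them.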

      Attachments

        1. TEZ-3732.1.patch
          13 kB
          Jonathan Turner Eagles
        2. TEZ-3732.2.patch
          13 kB
          Jonathan Turner Eagles
        3. TEZ-3732.3.patch
          13 kB
          Jonathan Turner Eagles
        4. TEZ-3732.4.patch
          13 kB
          Jonathan Turner Eagles

          People

            Assignee: Jonathan Turner Eagles (jeagles)
            Reporter: Jonathan Turner Eagles (jeagles)