  1. Hadoop Map/Reduce
  2. MAPREDUCE-7223

Quicksort GetInt performance Issue in Terasort


Details

    • Patch

    Description

      Profiling the Terasort case in Hadoop shows a hot spot in 'java.nio.Bits.getIntL'.
      That method builds an int by reading four bytes from the byte array and shifting them into place on every call.
      'getIntL' is called repeatedly during the quick-sort of the KVbuffer, which has O(N log N) complexity, and this is what causes the hot spot.
      The same element may be read many times during the quick-sort, so its bytes are shifted and combined again on each read.
      After replacing 'java.nio.Bits.getIntL' with 'unsafe.getInt', quick-sort performance improves by about 30% and Terasort improves by about 10%.
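
      The difference between the two read paths can be illustrated with a minimal sketch. It is not the code from the attached patches: the class name IntReadSketch and both helper methods are hypothetical, and the shifting method only imitates the byte-by-byte assembly described above. It assumes a JDK where 'sun.misc.Unsafe' is still reachable through reflection, and it only reads at 4-byte-aligned offsets, since unaligned access may not be safe on every platform.

```java
import java.lang.reflect.Field;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

// Hypothetical illustration class; not part of Hadoop or of the patch.
class IntReadSketch {
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long BYTE_ARRAY_BASE = Unsafe.ARRAY_BYTE_BASE_OFFSET;
    private static final boolean LITTLE_ENDIAN =
        ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN;

    // sun.misc.Unsafe has no public constructor; the usual workaround is
    // to read its private static "theUnsafe" field via reflection.
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not available", e);
        }
    }

    // Little-endian read assembled from four single-byte loads and shifts,
    // imitating the byte-by-byte approach described in the issue.
    public static int readIntByShifting(byte[] buf, int offset) {
        return (buf[offset] & 0xFF)
             | (buf[offset + 1] & 0xFF) << 8
             | (buf[offset + 2] & 0xFF) << 16
             | (buf[offset + 3] & 0xFF) << 24;
    }

    // Single 4-byte load in native byte order. Unsafe does no bounds
    // checking, so the range is checked explicitly here.
    public static int readIntUnsafe(byte[] buf, int offset) {
        if (offset < 0 || offset > buf.length - 4) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        int v = UNSAFE.getInt(buf, BYTE_ARRAY_BASE + offset);
        // Normalize to little-endian so both methods agree on any host.
        return LITTLE_ENDIAN ? v : Integer.reverseBytes(v);
    }

    public static void main(String[] args) {
        byte[] buf = {0x78, 0x56, 0x34, 0x12, (byte) 0xFF, 0x00, 0x00, (byte) 0x80};
        for (int off = 0; off <= 4; off += 4) {
            int a = readIntByShifting(buf, off);
            int b = readIntUnsafe(buf, off);
            System.out.println("offset " + off + ": results match = " + (a == b));
        }
    }
}
```

      Both methods return the same value for the same offset; the second one simply replaces four loads and three shifts with one load, which is where a saving inside a sort's comparison loop would come from. Any speed-up on a given JVM would need to be measured rather than assumed.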

       

      Quick-sort performance: quick-sort takes 16515 s using unsafe and 21643 s using byteBuffer.

                unsafe (s)   byteBuffer (s)   byteBuffer/unsafe
      AVG       16515        21643            1.310481735

      Attachments

        1. terasort.png
          7 kB
          WuZeyi
        2. makeint.png
          7 kB
          WuZeyi
        3. MAPREDUCE-7223-001.patch
          4 kB
          WuZeyi
        4. MAPREDUCE-7223-002.patch
          5 kB
          WuZeyi
        5. MAPREDUCE-7223-003.patch
          5 kB
          WuZeyi

        Activity

          People

            Zeyiii WuZeyi