HIVE-17417: LazySimple Timestamp is very expensive


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.4.0, 3.0.0
    • Fix Version/s: 2.4.0, 3.0.0
    • None

    Description

      When a schema contains an array<struct> column with timestamp and date fields, and the array holds more than 10000 elements, any access to that column is very expensive in terms of CPU, because most of the time goes to serializing the timestamp and date values. Refer to the attached profiles: more than 70% of the time is spent in serialization and toString conversions.
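
      The cost pattern described above can be illustrated with a small standalone sketch. It is not the code from the attached patches and does not use any Hive classes; the class name TimestampTextCost and the helper serializeAll are hypothetical, and java.sql.Timestamp.toString() stands in for whatever text conversion the serializer performs. It only shows why a per-element string conversion adds up when an array holds more than 10000 timestamps: every element allocates a fresh String and a fresh byte array.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.List;

public class TimestampTextCost {

    // Hypothetical helper: converts every timestamp to text and then to bytes,
    // roughly what a text-based serializer has to do for each element.
    public static long serializeAll(List<Timestamp> values) {
        long totalBytes = 0;
        for (Timestamp ts : values) {
            // One String plus one byte[] allocated per element.
            byte[] encoded = ts.toString().getBytes(StandardCharsets.UTF_8);
            totalBytes += encoded.length;
        }
        return totalBytes;
    }

    public static void main(String[] args) {
        // Assumption: 10001 elements, to match the "array size >10000" case.
        List<Timestamp> column = new ArrayList<>();
        for (int i = 0; i < 10_001; i++) {
            column.add(new Timestamp(1_500_000_000_000L + i * 1000L));
        }

        long start = System.nanoTime();
        long bytes = serializeAll(column);
        long micros = (System.nanoTime() - start) / 1_000;

        // Timing varies by machine and JVM warm-up; treat it as indicative only.
        System.out.println(bytes + " bytes produced in " + micros + " us");
    }
}
```

      A single run like this is not a benchmark; a JMH harness would be needed for reliable numbers.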

      Attachments

        1. date-serialize.png
          347 kB
          Prasanth Jayachandran
        2. HIVE-17417.1.patch
          2 kB
          Prasanth Jayachandran
        3. HIVE-17417.2.patch
          3 kB
          Prasanth Jayachandran
        4. HIVE-17417.3.patch
          3 kB
          Prasanth Jayachandran
        5. HIVE-17417.4.patch
          2 kB
          Prasanth Jayachandran
        6. HIVE-17417.5.patch
          2 kB
          Prasanth Jayachandran
        7. HIVE-17417.6.patch
          2 kB
          Prasanth Jayachandran
        8. timestamp-serialize.png
          381 kB
          Prasanth Jayachandran
        9. ts-jmh-perf.png
          17 kB
          Prasanth Jayachandran

            People

              Assignee: Prasanth Jayachandran (prasanth_j)
              Reporter: Prasanth Jayachandran (prasanth_j)
              Votes: 1
              Watchers: 6
