Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-47369

Fix performance regression in JDK 17 caused from RocksDB logging

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 3.3.0, 3.3.1, 3.3.3, 3.4.2, 3.3.2, 3.4.0, 3.4.1, 3.5.0, 3.5.1, 3.3.4
    • None
    • Structured Streaming
    • None

    Description

      JDK 17 has a performance regression in the JNI's AttachCurrentThread and DetachCurrentThread calls, as reported here: https://bugs.openjdk.org/browse/JDK-8314859. You can find a minimal reproduction of the JDK issue in that bug report. I have marked as affected versions 3.3.0^ since that is when JDK 17 started being offered in Spark.

      For context, every time RocksDB logs, it currently attaches itself to the JVM, invokes the RocksDB logging callback that we specify, and then detaches itself from the JVM. These attach/detach calls regressed, causing JDK 17 SS queries to run up to 10-15% slower than their respective JDK 8 queries.

      For example, a 100K record/second dropDuplicates had a p95 latency regression of 12%. A regression of 12% and 21% (at the p95) was observed for a query with 1M record/second, 100K keys, 10 second windows, and 0 second watermark.

      Because the Hotspot folks marked this as "Won't fix," one way to fix this is to avoid the JNI entirely and write the RocksDB to stderr. RocksDB 8.11.3 natively supports this (I implemented that feature in RocksJava). We can configure our RocksDB logger to do its logging this way.

      Attachments

        Activity

          People

            Unassigned Unassigned
            neilramaswamy Neil Ramaswamy
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated: