Spark / SPARK-41395

InterpretedMutableProjection can corrupt unsafe buffer when used with decimal data


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.1, 3.2.3, 3.4.0
    • Fix Version/s: 3.2.3, 3.3.2, 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      The following returns the wrong answer:

      set spark.sql.codegen.wholeStage=false;
      set spark.sql.codegen.factoryMode=NO_CODEGEN;
      
      select max(col1), max(col2) from values
      (cast(null  as decimal(27,2)), cast(null   as decimal(27,2))),
      (cast(77.77 as decimal(27,2)), cast(245.00 as decimal(27,2)))
      as data(col1, col2);
      
      +---------+---------+
      |max(col1)|max(col2)|
      +---------+---------+
      |null     |239.88   |
      +---------+---------+
      

      This is because InterpretedMutableProjection inappropriately uses InternalRow#setNullAt to set null for decimal types with precision > Decimal.MAX_LONG_DIGITS.

      The path to corruption goes like this:

      Unsafe buffer at start:

                                                offset/len for   offset/len for
                                                1st decimal      2nd decimal
      
      offset: 0                8                16 (0x10)        24 (0x18)        32 (0x20)
      data:   0300000000000000 0000000018000000 0000000028000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      

      When processing the first incoming row ([null, null]), InterpretedMutableProjection calls setNullAt for the decimal types. As a result, the pointers to the storage areas for the two decimals in the variable length region get zeroed out.

      Buffer after projecting first row (null, null):

                                                offset/len for   offset/len for
                                                1st decimal      2nd decimal
      
      offset: 0                8                16 (0x10)        24 (0x18)        32 (0x20)
      data:   0300000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      

      When it's time to project the second row into the buffer, UnsafeRow#setDecimal uses the zeroed offsets, causing it to overwrite the null-tracking bit set with decimal data:

              null-tracking
              bit area
      offset: 0                8                16 (0x10)        24 (0x18)        32 (0x20)
      data:   5db4000000000000 0000000000000000 0200000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
      

      The null-tracking bit set is overwritten with 239.88 (0x5db4) rather than 245.00 (0x5fb4) because setDecimal indirectly calls setNotNullAt(1), which turns off the null-tracking bit associated with the field at index 1.

      In addition, the decimal at field index 0 is now null because of the corruption of the null-tracking bit set.
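      The corruption path above can be reproduced with a small standalone Python simulation of the UnsafeRow fixed-region layout (names, the 16-byte slot size, and the helper functions are illustrative assumptions, not Spark code): an 8-byte null-tracking bit set, followed by one 8-byte word per field holding `(offset << 32) | length` for each large decimal's storage slot.

      ```python
      # Hypothetical simulation of the buggy path; not Spark's actual implementation.
      buf = bytearray(24 + 32)  # fixed region + two 16-byte variable-length slots

      def word(i):
          return int.from_bytes(buf[8 + 8*i : 16 + 8*i], 'little')

      def put_word(i, w):
          buf[8 + 8*i : 16 + 8*i] = w.to_bytes(8, 'little')

      def set_null_bit(i, null):
          bits = int.from_bytes(buf[0:8], 'little')
          bits = bits | (1 << i) if null else bits & ~(1 << i)
          buf[0:8] = bits.to_bytes(8, 'little')

      def set_null_at(i):
          # The buggy path: zeroes the whole field word, destroying the offset.
          set_null_bit(i, True)
          put_word(i, 0)

      def set_decimal(i, unscaled):
          cursor = word(i) >> 32               # 0 after set_null_at zeroed the word!
          data = unscaled.to_bytes(2, 'big')   # BigInteger-style big-endian bytes
          buf[cursor : cursor + len(data)] = data
          put_word(i, (cursor << 32) | len(data))
          set_null_bit(i, False)               # setNotNullAt: flips a bit at offset 0

      # Offsets as in the dump above: 0x18 for field 0, 0x28 for field 1.
      put_word(0, (0x18 << 32) | 16)
      put_word(1, (0x28 << 32) | 16)

      set_null_at(0); set_null_at(1)  # project (null, null): both offsets lost
      set_decimal(0, 7777)            # 77.77's bytes land at offset 0, not 0x18
      set_decimal(1, 24500)           # 245.00's bytes also land at offset 0...

      # ...and setNotNullAt(1) clears bit 1 of the word at offset 0, turning the
      # leading 0x5F of 24500 (0x5FB4) into 0x5D, i.e. 23988 (239.88).
      corrupted = int.from_bytes(buf[0:2], 'big')
      print(corrupted)                 # 23988
      field0_is_null = bool(int.from_bytes(buf[0:8], 'little') & 1)
      print(field0_is_null)            # True: field 0 now reads as null
      ```

      The final buffer state matches the dump above: the null-tracking word starts with 0x5d 0xb4, and the second field word becomes 0200000000000000 (offset 0, length 2).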

      When a decimal type with precision > Decimal.MAX_LONG_DIGITS is null, InterpretedMutableProjection should write a null Decimal value rather than call setNullAt, since UnsafeRow#setDecimal handles a null value by preserving the stored offset for future updates.
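      As a sketch of that corrected behavior, here is a standalone Python simulation (hypothetical names and layout, not Spark code) in which marking a large decimal null sets the null-tracking bit and zeroes only the length, leaving the offset intact so later writes land in the right slot:

      ```python
      # Hypothetical sketch of the fixed null handling; not Spark's implementation.
      buf = bytearray(24 + 32)  # 8-byte null bit set + two field words + two slots

      def word(i):
          return int.from_bytes(buf[8 + 8*i : 16 + 8*i], 'little')

      def put_word(i, w):
          buf[8 + 8*i : 16 + 8*i] = w.to_bytes(8, 'little')

      def set_null_bit(i, null):
          bits = int.from_bytes(buf[0:8], 'little')
          bits = bits | (1 << i) if null else bits & ~(1 << i)
          buf[0:8] = bits.to_bytes(8, 'little')

      def set_decimal(i, unscaled):
          """Write a decimal; unscaled=None marks null but preserves the offset."""
          cursor = word(i) >> 32
          if unscaled is None:
              set_null_bit(i, True)
              put_word(i, cursor << 32)   # length 0, offset kept for future update
              return
          data = unscaled.to_bytes(2, 'big')
          buf[cursor : cursor + len(data)] = data
          put_word(i, (cursor << 32) | len(data))
          set_null_bit(i, False)

      put_word(0, (0x18 << 32) | 16)
      put_word(1, (0x28 << 32) | 16)

      set_decimal(0, None); set_decimal(1, None)   # project (null, null) safely
      set_decimal(0, 7777); set_decimal(1, 24500)  # project (77.77, 245.00)

      vals = [int.from_bytes(buf[w >> 32 : (w >> 32) + (w & 0xFFFFFFFF)], 'big')
              for w in (word(0), word(1))]
      print(vals)   # [7777, 24500]: the correct unscaled values, 77.77 and 245.00
      ```

      With the offsets preserved, the second row's decimals are written at 0x18 and 0x28 as intended, the null-tracking bit set is left alone, and both fields read back correctly.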

      This bug can be exercised during codegen fallback. Take, for example, this case where I forced codegen to fail for the Greatest expression:

      spark-sql> select max(col1), max(col2) from values
      (cast(null  as decimal(27,2)), cast(null   as decimal(27,2))),
      (cast(77.77 as decimal(27,2)), cast(245.00 as decimal(27,2)))
      as data(col1, col2);
      
      22/12/05 08:18:54 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 58, Column 1: ';' expected instead of 'if'
      org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 58, Column 1: ';' expected instead of 'if'
      	at org.codehaus.janino.TokenStreamImpl.compileException(TokenStreamImpl.java:362)
      	at org.codehaus.janino.TokenStreamImpl.read(TokenStreamImpl.java:149)
      	at org.codehaus.janino.Parser.read(Parser.java:3787)
      ...
      22/12/05 08:18:56 WARN MutableProjection: Expr codegen error and falling back to interpreter mode
      java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 43, Column 1: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 43, Column 1: ';' expected instead of 'boolean'
      	at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
      	at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)
      	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1583)
      	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1580)
      	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
      	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
      	... 36 more
      ...
      
      NULL	239.88   <== incorrect result, should be (77.77, 245.00)
      Time taken: 6.132 seconds, Fetched 1 row(s)
      spark-sql>
      

      People

        Assignee: bersprockets Bruce Robbins
        Reporter: bersprockets Bruce Robbins
        Votes: 0
        Watchers: 3