Spark / SPARK-26199

Long expressions cause mutate to fail

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 3.1.0, 3.2.0
    • Component/s: SparkR
    • Labels: None

    Description

      Calling mutate(df, field = expr) fails with "subscript out of bounds" when expr is long enough that its deparsed text spans more than one line.

      Example:

      df <- mutate(df, field = ifelse(
          lit(TRUE),
          lit("A"),
          ifelse(
              lit(T),
              lit("BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"),
              lit("C")
          )
      ))
      

      Stack trace:

      FATAL subscript out of bounds
        at .handleSimpleError(function (obj) 
      {
          level = sapply(class(obj), sw
        at FUN(X[[i]], ...)
        at lapply(seq_along(args), function(i) {
          if (ns[[i]] != "") {
      
      at lapply(seq_along(args), function(i) {
          if (ns[[i]] != "") {
      
      at mutate(df, field = ifelse(lit(TRUE), lit("A"), ifelse(lit(T), lit("BBB
        at #78: mutate(df, field = ifelse(lit(TRUE), lit("A"), ifelse(lit(T
      

      The root cause is in: DataFrame.R#LL2182

      When the expression is long, deparse returns multiple lines, so args ends up with more elements than ns. Possible fixes are to set nlines = 1 (which keeps only the first line of the deparsed text) or to collapse the lines into a single string.
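      The multi-line behaviour of deparse and the two proposed fixes can be checked in plain R, without SparkR loaded. This is a minimal sketch, not the code from DataFrame.R: the variable names (long_expr, pieces, first_line, collapsed) are made up for illustration, and quote() is used so that lit() and ifelse() are never actually called.

```r
# Capture the long expression from the report without evaluating it.
long_expr <- quote(ifelse(
  lit(TRUE),
  lit("A"),
  ifelse(
    lit(T),
    lit("BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"),
    lit("C")
  )
))

# deparse() breaks its output at roughly width.cutoff characters
# (default 60), so a long call comes back as several strings.
pieces <- deparse(long_expr)
print(length(pieces))

# Proposed fix 1: nlines = 1 keeps only the first line of output,
# so the resulting text is a truncated version of the expression.
first_line <- deparse(long_expr, nlines = 1)

# Proposed fix 2: collapse all lines into one string, keeping the
# full text of the expression.
collapsed <- paste(deparse(long_expr), collapse = "")
print(collapsed)
```

      Collapsing keeps the whole expression text, while nlines = 1 drops everything after the first line break, which matters only if the deparsed text is used for more than counting arguments.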

      A simple workaround exists: assign the expression to a variable first, then pass that variable to mutate:

      tmp <- ifelse(
          lit(TRUE),
          lit("A"),
          ifelse(
              lit(T),
              lit("BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"),
              lit("C")
          )
      )
      df <- mutate(df, field = tmp)
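
      Why the workaround helps can be sketched in plain R. The helper below, arg_labels, is hypothetical and only imitates the deparse-each-argument pattern described above; it is not SparkR's implementation. When the argument is a bare variable name, the unevaluated argument is just the symbol tmp, which deparses to a single short string.

```r
# Hypothetical helper: deparse every unevaluated argument passed via `...`.
# The arguments are only substituted, never evaluated, so neither SparkR
# nor a variable called `tmp` has to exist.
arg_labels <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]  # drop the leading `list`
  lapply(exprs, deparse)
}

# Inline long expression: the deparsed text spans several lines.
labels_inline <- arg_labels(field = ifelse(
  lit(TRUE),
  lit("A"),
  ifelse(
    lit(T),
    lit("BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"),
    lit("C")
  )
))
print(length(labels_inline$field))

# Variable in place of the expression: the argument is the symbol `tmp`.
labels_tmp <- arg_labels(field = tmp)
print(labels_tmp$field)
```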
      

            People

              Assignee: Michael Chirico (michaelchirico)
              Reporter: João Rafael (joao.rafael)