  1. Spark
  2. SPARK-47085

Performance issue on Thrift API


Details

    Description

      The O(n^2) complexity was introduced in SPARK-39041.

      Before that PR, the code was:

      while (curRow < maxRows && iter.hasNext) {
        val sparkRow = iter.next()
        val row = ArrayBuffer[Any]()
        var curCol = 0
        while (curCol < sparkRow.length) {
          if (sparkRow.isNullAt(curCol)) {
            row += null
          } else {
            addNonNullColumnValue(sparkRow, row, curCol, timeFormatters)
          }
          curCol += 1
        }
        resultRowSet.addRow(row.toArray.asInstanceOf[Array[Object]])
        curRow += 1
      }

      A foreach-style iteration avoids the O(n^2) complexity, so this change just returns the code to the state it was in before.

       

      In class `RowSetUtils` there is a loop that has O(n^2) complexity:

      ...
      while (i < rowSize) {
        val row = rows(i)
                ...
      

      It can be easily converted back into O(n) complexity.
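A minimal sketch of why the indexed loop can be quadratic, assuming `rows` is a linked-list-backed `Seq` such as `List` (the issue does not state the concrete collection type). On a `List`, `rows(i)` walks `i` cells from the head, so a `while` loop over all indices costs O(n^2) in total, while a single `foreach` pass costs O(n). The names `IndexedVsForeach`, `sumByIndex`, and `sumByForeach` are hypothetical and are not part of `RowSetUtils`; summing integers stands in for the per-row conversion work.

```scala
// Hypothetical illustration; not code from RowSetUtils.
object IndexedVsForeach {

  // Indexed access: on a List, rows(i) is O(i), so the whole loop is O(n^2).
  def sumByIndex(rows: Seq[Int]): Long = {
    var total = 0L
    var i = 0
    val rowSize = rows.length
    while (i < rowSize) {
      val row = rows(i) // linear-time lookup when rows is a List
      total += row
      i += 1
    }
    total
  }

  // Single traversal: foreach visits each element exactly once, O(n) for any Seq.
  def sumByForeach(rows: Seq[Int]): Long = {
    var total = 0L
    rows.foreach { row =>
      total += row
    }
    total
  }
}
```

Both functions return the same result; only the traversal cost differs. If the caller passes an indexed collection such as `Vector` or `ArraySeq`, the indexed loop is not quadratic, which is why the slowdown depends on the concrete `Seq` type that reaches this code.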

       

       

            People

              igreenfi Izek Greenfield