  1. Spark
  2. SPARK-29380

RFormula avoid repeated 'first' jobs to get vector size

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: ML
    • Labels: None

    Description

      In the current implementation, RFormula triggers a separate 'first' job to get the size of a vector column whenever that size cannot be obtained from the AttributeGroup, so several vector columns can mean several jobs.

      This can be optimized by getting the first row lazily and reusing it for every vector column.
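
      The idea above can be sketched outside Spark. This is a minimal illustration in plain Python, not the actual patch: `FakeDataset`, `LazyFirstRow`, and `vector_sizes` are hypothetical names, the `known_sizes` dict stands in for the size an AttributeGroup may or may not carry (-1 meaning unknown), and each call to `first()` models one job.

```python
from functools import cached_property


class FakeDataset:
    """Hypothetical stand-in for a DataFrame; counts how many 'first' jobs run."""

    def __init__(self, rows, known_sizes):
        self.rows = rows                # list of dicts: column name -> vector (list)
        self.known_sizes = known_sizes  # column name -> size from metadata, -1 if unknown
        self.first_jobs = 0

    def first(self):
        self.first_jobs += 1            # each call models one job on the cluster
        return self.rows[0]


class LazyFirstRow:
    """Fetches the first row on first access only, then reuses the cached row."""

    def __init__(self, dataset):
        self._dataset = dataset

    @cached_property
    def row(self):
        return self._dataset.first()


def vector_sizes(dataset, vector_cols):
    """Resolve each vector column's size, preferring metadata over the data."""
    lazy = LazyFirstRow(dataset)
    sizes = {}
    for col in vector_cols:
        size = dataset.known_sizes.get(col, -1)
        if size < 0:
            # Metadata has no size: fall back to the shared, lazily fetched row.
            size = len(lazy.row[col])
        sizes[col] = size
    return sizes
```

      With this shape, a dataset whose metadata already carries every size never runs a 'first' job, and a dataset with several unknown sizes runs exactly one.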

            People

              Assignee: podongfeng Ruifeng Zheng
              Reporter: podongfeng Ruifeng Zheng