Details
Improvement
Status: Open
Major
Resolution: Unresolved
3.1.0
None
None
Description
Persisting instances in ml.regression.IsotonicRegression.fit() is unnecessary, because instances is used only once, in the call to run(instances). Caching pays off only when the same data is read by more than one action.
override def fit(dataset: Dataset[_]): IsotonicRegressionModel = instrumented { instr =>
  transformSchema(dataset.schema, logging = true)
  // Extract columns from data. If dataset is persisted, do not persist oldDataset.
  val instances = extractWeightedLabeledPoints(dataset)
  val handlePersistence = dataset.storageLevel == StorageLevel.NONE
  // Unnecessary persist
  if (handlePersistence) instances.persist(StorageLevel.MEMORY_AND_DISK)

  instr.logPipelineStage(this)
  instr.logDataset(dataset)
  instr.logParams(this, labelCol, featuresCol, weightCol, predictionCol, featureIndex, isotonic)
  instr.logNumFeatures(1)

  val isotonicRegression = new MLlibIsotonicRegression().setIsotonic($(isotonic))
  val oldModel = isotonicRegression.run(instances) // Only use once here

  if (handlePersistence) instances.unpersist()
This issue was reported by our tool CacheCheck, which dynamically detects misuses of the persist()/unpersist() API.
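To illustrate why a persist that is followed by a single use gains nothing, here is a minimal, self-contained sketch in plain Scala. It does not use Spark: LazyDataset, computeCount, and PersistDemo are hypothetical names invented for this illustration, and the class is only a toy model of an RDD that recomputes its data on every action unless it has been persisted.

```scala
// Hypothetical stand-in for an RDD (not Spark's implementation): the data is
// recomputed on every action unless persist() has been called beforehand.
final class LazyDataset[A](compute: () => Seq[A]) {
  private var persisted = false
  private var cached: Option[Seq[A]] = None
  var computeCount = 0 // number of times the underlying data was computed

  def persist(): this.type = { persisted = true; this }
  def unpersist(): this.type = { persisted = false; cached = None; this }

  private def materialize(): Seq[A] = cached match {
    case Some(data) => data
    case None =>
      computeCount += 1
      val data = compute()
      if (persisted) cached = Some(data)
      data
  }

  // An "action" in this toy model: forces the data to be materialized.
  def sum(implicit num: Numeric[A]): A = materialize().sum
}

object PersistDemo {
  // Runs `actions` actions on a fresh dataset and returns how many times
  // the data had to be computed.
  def computations(actions: Int, usePersist: Boolean): Int = {
    val ds = new LazyDataset[Double](() => Seq(1.0, 2.0, 3.0))
    if (usePersist) ds.persist()
    (1 to actions).foreach(_ => ds.sum)
    if (usePersist) ds.unpersist()
    ds.computeCount
  }

  def main(args: Array[String]): Unit = {
    println(computations(actions = 1, usePersist = false)) // single use, no persist
    println(computations(actions = 1, usePersist = true))  // single use, with persist
    println(computations(actions = 2, usePersist = false)) // two uses, no persist
    println(computations(actions = 2, usePersist = true))  // two uses, with persist
  }
}
```

In this model a single action computes the data once whether or not it is persisted, so the persist only adds the cost of storing the data. The toy assumes instances is traversed by exactly one action; whether that holds inside run(instances) is the claim made in this report and is not verified by the sketch.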