Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 3.5.0
Fix Version/s: None
None
Description
pyspark.ml.functions.predict_batch_udf does not support return types with more than 2 dimensions:
https://github.com/apache/spark/pull/37734#discussion_r1016156053
Many computer vision models return ndarrays with 3 or 4 dimensions. Segmentation, for example, returns 3 dimensions, [Category, H, W], and if there is a time dimension, that becomes the fourth.
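Until higher-dimensional return types are supported, one possible workaround is to flatten each prediction to a single row and restore the shape downstream. The sketch below illustrates this with NumPy only; `fake_segmentation_model`, `predict_flat`, `restore_shape`, and the shape constants are hypothetical names invented for illustration, and whether this matches the approach the maintainers would recommend is an assumption.

```python
import numpy as np

# Hypothetical per-image output shape of a segmentation model: [Category, H, W].
NUM_CATEGORIES, HEIGHT, WIDTH = 3, 4, 5


def fake_segmentation_model(batch: np.ndarray) -> np.ndarray:
    """Stand-in for a real model: maps [N, H, W] inputs to [N, Category, H, W]."""
    return np.stack([batch * (c + 1) for c in range(NUM_CATEGORIES)], axis=1)


def predict_flat(batch: np.ndarray) -> np.ndarray:
    """Return a 2-D [N, Category*H*W] array, i.e. one flat row per input.

    A function of this shape is what one would return from the
    make_predict_fn passed to predict_batch_udf (assumption: paired with a
    flat array return type such as ArrayType(FloatType())).
    """
    out = fake_segmentation_model(batch)
    return out.reshape(out.shape[0], -1)


def restore_shape(flat_row) -> np.ndarray:
    """Rebuild the [Category, H, W] array from one flattened prediction row."""
    return np.asarray(flat_row).reshape(NUM_CATEGORIES, HEIGHT, WIDTH)


# Two hypothetical input images of shape [H, W].
batch = np.arange(2 * HEIGHT * WIDTH, dtype=np.float32).reshape(2, HEIGHT, WIDTH)
flat = predict_flat(batch)        # 2-D: one row per image
restored = restore_shape(flat[0])  # 3-D again: [Category, H, W]
```

The drawback of this approach is that the original shape has to be carried out of band (here as module-level constants), which is the bookkeeping that native support for 3 and 4 dimensions would remove.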