Spark / SPARK-28365

Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: ML
    • Labels: None

      Description

      My machine's default locale is not among the available locales reported by the JVM's Locale class. As a result, the StopWordsRemover Python test fails when I run it locally, with an error like:

      Traceback (most recent call last):
        File "/spark-1/python/pyspark/ml/tests/test_feature.py", line 87, in test_stopwordsremover
          stopWordRemover = StopWordsRemover(inputCol="input", outputCol="output")
        File "/spark-1/python/pyspark/__init__.py", line 111, in wrapper
          return func(self, **kwargs)
        File "/spark-1/python/pyspark/ml/feature.py", line 2646, in __init__
          self.uid)
        File "/spark-1/python/pyspark/ml/wrapper.py", line 67, in _new_java_obj
          return java_obj(*java_args)
        File "/spark-1/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1554, in __call__
          answer, self._gateway_client, None, self._fqn)
        File "/spark-1/python/pyspark/sql/utils.py", line 93, in deco
          raise converted
      pyspark.sql.utils.IllegalArgumentException: 'StopWordsRemover_4598673ee802 parameter locale given invalid value en_TW.'
      

      Following Hyukjin Kwon's advice, rather than changing the locale just to make the test pass, it is better for StopWordsRemover to fall back to a workable locale when the system default locale is not among the available locales in the JVM. Otherwise, users have to change the system locale manually or access the private property _jvm in PySpark.
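
      The proposed fallback can be illustrated with a minimal sketch. This is plain Python written for illustration only and is not the actual Spark change. The function name resolve_default_locale and the sample list of available locales are assumptions; only the en_US fallback and the en_TW example come from this issue.

```python
from typing import Iterable

# Fallback locale named in the issue title.
FALLBACK_LOCALE = "en_US"


def resolve_default_locale(system_default: str, available: Iterable[str]) -> str:
    """Return system_default if it is an available locale, else the fallback.

    Hypothetical helper: it mirrors the behavior proposed in this issue,
    not the code that was merged into Spark.
    """
    if system_default in set(available):
        return system_default
    return FALLBACK_LOCALE


# Hypothetical sample of available locales; a real JVM reports many more.
sample_available = ["en_US", "en_GB", "zh_TW", "ja_JP"]

print(resolve_default_locale("en_TW", sample_available))  # falls back to en_US
print(resolve_default_locale("zh_TW", sample_available))  # kept as zh_TW
```

      With this behavior, constructing StopWordsRemover on a machine whose default locale is en_TW would use en_US instead of raising the IllegalArgumentException shown above.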

              People

              • Assignee: Liang-Chi Hsieh (viirya)
              • Reporter: Liang-Chi Hsieh (viirya)
