Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-26776

Reduce Py4J communication cost in PySpark's execution barrier check

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • 2.4.0, 3.0.0
    • 3.0.0
    • PySpark
    • None

    Description

      I am investigating flaky tests. I realised that:

            File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/rdd.py", line 2512, in __init__
              self.is_barrier = prev._is_barrier() or isFromBarrier
            File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/rdd.py", line 2412, in _is_barrier
              return self._jrdd.rdd().isBarrier()
            File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
              answer, self.gateway_client, self.target_id, self.name)
            File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/protocol.py", line 342, in get_return_value
              return OUTPUT_CONVERTER[type](answer[2:], gateway_client)
            File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 2492, in <lambda>
              lambda target_id, gateway_client: JavaObject(target_id, gateway_client))
            File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1324, in __init__
              ThreadSafeFinalizer.add_finalizer(key, value)
            File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/finalizer.py", line 43, in add_finalizer
              cls.finalizers[id] = weak_ref
            File "/usr/lib64/pypy-2.5.1/lib-python/2.7/threading.py", line 216, in __exit__
              self.release()
            File "/usr/lib64/pypy-2.5.1/lib-python/2.7/threading.py", line 208, in release
              self.__block.release()
          error: release unlocked lock
      

      I assume it might not be directly related with the test itself but I noticed that it prev._is_barrier() attempts to access via Py4J.

      Accessing via Py4J is expensive and IMHO it makes it flaky.

      Attachments

        Issue Links

          Activity

            People

              gurwls223 Hyukjin Kwon
              gurwls223 Hyukjin Kwon
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: