  1. Beam
  2. BEAM-11959

Python Beam SDK Harness hangs when installing pip packages

Details

    • Type: Bug
    • Status: Open
    • Priority: P1
    • Resolution: Unresolved
    • Affects Version/s: 2.27.0, 2.28.0, 2.31.0, 2.32.0
    • None
    • None
    • Environment: Kubernetes v1.20.1
    • Important

    Description

      When running a Beam pipeline with Flink as the backend, the Python SDK harness hangs while installing pip packages. Tested with Flink 1.10.3.

      Images used: 

      apache/beam_python3.7_sdk:2.28.0

      apache/flink:1.10.3

      Beam args used are:

      "--runner=FlinkRunner",
      "--flink_version=1.10",  # same result with 1.13
      "--flink_master=http://flink-jobmanager.default:8081",
      "--artifacts_dir=/mnt/flink",
      "--environment_type=EXTERNAL",
      "--environment_config=localhost:50000",

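Flags that pass through a rich-text editor can have their leading hyphens replaced by a typographic dash, so it may be worth checking the list before submitting. The following is a minimal sketch of such a check. The helper name `find_malformed_flags` is hypothetical and not part of Beam, and nothing here is claimed to be related to the hang itself.

```python
# Hypothetical helper (not part of Beam): report arguments whose prefix is
# not two plain ASCII hyphens, e.g. after an editor turned "--" into a dash.
def find_malformed_flags(args):
    """Return the arguments that do not start with two ASCII hyphens."""
    return [arg for arg in args if not arg.startswith("--")]


# The pipeline arguments from this report, written with ASCII hyphens.
beam_args = [
    "--runner=FlinkRunner",
    "--flink_version=1.10",
    "--flink_master=http://flink-jobmanager.default:8081",
    "--artifacts_dir=/mnt/flink",
    "--environment_type=EXTERNAL",
    "--environment_config=localhost:50000",
]

malformed = find_malformed_flags(beam_args)
```

An empty `malformed` list means every argument has the expected `--` prefix.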
       

      Specifically, this was tested by running a TFX pipeline. The pipeline is submitted and registered as expected, but the SDK harness hangs while installing the extra package:

      2021/03/10 12:16:20 Initializing python harness: /opt/apache/beam/boot --id=1-1 --logging_endpoint=localhost:39795 --artifact_endpoint=localhost:34095 --provision_endpoint=localhost:42999 --control_endpoint=localhost:38129
      2021/03/10 12:16:20 Found artifact: tfx_ephemeral-0.27.0.tar.gz
      2021/03/10 12:16:20 Found artifact: extra_packages.txt
      2021/03/10 12:16:20 Installing setup packages ...
      2021/03/10 12:16:20 Installing extra package: tfx_ephemeral-0.27.0.tar.gz

      Nothing further is logged after this, regardless of how long the harness is left running. Installing the TFX package manually by exec-ing into the container takes less than 3 minutes.
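To compare the manual install with the harness behaviour, the install can be run under an explicit time limit so that a hang shows up as a timeout instead of an indefinite wait. This is a rough sketch only: `run_with_timeout` and `pip_install_command` are hypothetical helper names, the time limit is arbitrary, and the command the harness actually runs may use different pip options.

```python
import subprocess
import sys
import time


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (returncode, elapsed_seconds, timed_out).

    returncode is None when the command was killed after timeout_s seconds.
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
        return completed.returncode, time.monotonic() - start, False
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return None, time.monotonic() - start, True


def pip_install_command(package_path):
    """Build a verbose pip install command for a local package archive."""
    return [sys.executable, "-m", "pip", "install", "--verbose", package_path]


# Example (the path is a placeholder; adjust to where the artifact is staged):
# run_with_timeout(pip_install_command("/path/to/tfx_ephemeral-0.27.0.tar.gz"), 600)
```

Running this inside the SDK harness container shows whether the install finishes, fails, or stalls, and the verbose output indicates the last step pip reached.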

      The Flink TaskManager then sits idle and periodically logs:

      2021-03-10 11:29:26,287 INFO org.apache.beam.runners.fnexecution.environment.ExternalEnvironmentFactory - Still waiting for startup of environment from localhost:50000 for worker id 1-1
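While the TaskManager is logging this message, it can help to confirm that something is listening on the external environment endpoint at all. The sketch below is a plain TCP check; the function name `endpoint_accepts_connections` is hypothetical. A successful connection only shows that the port is open, not that the harness has finished starting.

```python
import socket


def endpoint_accepts_connections(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Example, run from inside the TaskManager pod:
# endpoint_accepts_connections("localhost", 50000)
```

If this returns False, the problem is likely in how the worker pool container is exposed. If it returns True, the pool is reachable and the stall is more likely inside the harness startup.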

      Helm charts attached below.

      Attachments

        1. jobmanager-configmap.yaml
          1 kB
          Jens Wiren
        2. jobmanager-deploy.yaml
          1 kB
          Jens Wiren
        3. jobmanager-svc.yaml
          0.4 kB
          Jens Wiren
        4. taskmanager-deploy.yaml
          2 kB
          Jens Wiren


          People

            Assignee: Unassigned
            Reporter: Jens Wiren (ConverJens)
            Votes: 2
            Watchers: 5
