Details
Type: Bug
Status: Open
Priority: P1
Resolution: Unresolved
Affects Version/s: 2.27.0, 2.28.0, 2.31.0, 2.32.0
None
None
Environment: Kubernetes v1.20.1
Important
Description
When running a Beam pipeline with Flink as the backend, the Python SDK harness hangs while trying to install pip packages. Tested using Flink 1.10.3.
Images used:
apache/beam_python3.7_sdk:2.28.0
apache/flink:1.10.3
Beam args used are:
"--runner=FlinkRunner",
"--flink_version=1.10", // same with 1.13
"--flink_master=http://flink-jobmanager.default:8081",
f"--artifacts_dir=/mnt/flink",
"--environment_type=EXTERNAL",
"--environment_config=localhost:50000",
Specifically, this was tested by running a TFX pipeline, which is submitted and registered as expected, but the SDK harness hangs while installing the extra package:
2021/03/10 12:16:20 Initializing python harness: /opt/apache/beam/boot --id=1-1 --logging_endpoint=localhost:39795 --artifact_endpoint=localhost:34095 --provision_endpoint=localhost:42999 --control_endpoint=localhost:38129
2021/03/10 12:16:20 Found artifact: tfx_ephemeral-0.27.0.tar.gz
2021/03/10 12:16:20 Found artifact: extra_packages.txt
2021/03/10 12:16:20 Installing setup packages ...
2021/03/10 12:16:20 Installing extra package: tfx_ephemeral-0.27.0.tar.gz
and nothing else is logged, regardless of how long it is left running. I can manually install the TFX package by exec'ing into the container in under 3 minutes.
The Flink task manager then sits idle and periodically logs:
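To make the manual install comparison repeatable, the install can be run under a hard timeout so that a hang is reported instead of blocking forever. This is a minimal sketch using only the Python standard library. The helper names `run_with_timeout` and `pip_install_cmd` are hypothetical, and the pip command line is an assumption: it is a plain verbose install and may differ from what the harness boot binary actually runs.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it ran longer than timeout_s seconds."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return None


def pip_install_cmd(package_path):
    """Build a verbose pip install command for a local package archive."""
    # Assumption: this only approximates the install performed by the harness.
    return [sys.executable, "-m", "pip", "install", "-v", package_path]
```

For example, `run_with_timeout(pip_install_cmd("tfx_ephemeral-0.27.0.tar.gz"), 600)` would return `None` if the install were still running after ten minutes. The archive path here is illustrative and would need to point at the staged artifact inside the container.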
2021-03-10 11:29:26,287 INFO org.apache.beam.runners.fnexecution.environment.ExternalEnvironmentFactory - Still waiting for startup of environment from localhost:50000 for worker id 1-1
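Since the task manager reports that it is still waiting on the environment at localhost:50000, one quick check from inside the task manager pod is whether that port accepts TCP connections at all. The sketch below uses only the standard library. `endpoint_reachable` is a hypothetical helper name, and a successful connection shows only that something is listening, not that the worker pool is healthy.

```python
import socket


def endpoint_reachable(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Calling `endpoint_reachable("localhost", 50000)` in the task manager container would help separate a networking problem between the containers from a hang inside the harness itself.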
Helm charts attached below.