Details
Bug
Status: Open
P1
Resolution: Unresolved
2.27.0, 2.28.0, 2.29.0, 2.30.0, 2.31.0
None
Kubernetes 1.20 on Ubuntu 18.04.
Important
Description
I'm running TFX pipelines on a Flink cluster using Beam in k8s. However, extra Python packages passed to the Flink runner (or rather, to the Beam worker sidecar) are only installed once per deployment cycle. Example:
- Flink is deployed and is up and running.
- A TFX pipeline starts and submits a job to Flink along with a Python whl containing custom code and Beam ops.
- The Beam worker installs the package and the pipeline finishes successfully.
- A new TFX pipeline is built in which a new Beam fn is introduced; the pipeline is started and the new whl is submitted as in the second step.
- This time, the new package is not installed in the Beam worker, so the job fails on a reference that does not exist in the Beam worker, because the new package was never installed there.
I have been using Flink since Beam version 2.27, and this has been an issue the whole time.
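For reference, the two submissions in the example can be sketched as the Beam flags each pipeline run would carry. This is only an illustration under assumptions: the helper name, the wheel file names, the Flink master address, and the use of an `EXTERNAL` environment pointing at a worker-pool sidecar on `localhost:50000` are hypothetical and may not match the actual deployment. Only the flag names themselves are standard Beam options.

```python
from typing import List


def build_beam_pipeline_args(
    wheel_path: str,
    flink_master: str = "flink-jobmanager:8081",  # hypothetical address
    worker_endpoint: str = "localhost:50000",     # hypothetical sidecar endpoint
) -> List[str]:
    """Return Beam flags for submitting a job to Flink with one extra wheel."""
    return [
        "--runner=FlinkRunner",
        f"--flink_master={flink_master}",
        # Assumption: the SDK harness runs as a worker-pool sidecar.
        "--environment_type=EXTERNAL",
        f"--environment_config={worker_endpoint}",
        # The custom code and Beam ops are shipped as an extra package.
        f"--extra_package={wheel_path}",
    ]


# First pipeline run: the wheel is installed by the worker as expected.
first_run = build_beam_pipeline_args("dist/my_pipeline-0.1.0-py3-none-any.whl")

# Second pipeline run: only the wheel differs, yet per this report the
# worker keeps using the package it installed during the first run.
second_run = build_beam_pipeline_args("dist/my_pipeline-0.2.0-py3-none-any.whl")
```

In a TFX setup, a list like this would typically be handed to the pipeline through its `beam_pipeline_args` argument, so the only thing that changes between the two runs is the `--extra_package` value.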
Issue Links
- is related to BEAM-13669 Install Python wheel and dependencies to local venv in SDK harness (Triage Needed)