Details
Type: Bug
Status: Resolved
Priority: P2
Resolution: Won't Fix
Version: 2.14.0
Description
Currently, the entry cost of Beam is more than 30M of JARs:
-rw-r--r-- 1 rmannibucau rmannibucau 13M févr. 17 11:45 beam-vendor-grpc-1_13_1-0.2.jar
-rw-r--r-- 1 rmannibucau rmannibucau 8,7M août 5 10:22 beam-sdks-java-core-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 2,6M août 5 10:25 beam-vendor-sdks-java-extensions-protobuf-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 2,6M févr. 17 11:45 beam-vendor-guava-20_0-0.1.jar
-rw-r--r-- 1 rmannibucau rmannibucau 1,4M août 5 10:21 beam-model-pipeline-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 825K août 5 10:25 beam-model-fn-execution-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 470K août 5 10:21 beam-model-job-management-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 446K août 5 10:25 beam-runners-core-construction-java-2.14.0.jar
-rw-r--r-- 1 rmannibucau rmannibucau 378K août 5 10:24 beam-runners-core-java-2.14.0.jar
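As a quick sanity check of the "> 30M" figure, the sizes from the listing above can be added up. This is only a sketch: the class name `BeamFootprint` is made up for illustration, the sizes are copied by hand from the rounded `ls -h` output (with the French decimal comma read as a decimal point), and `M`/`K` are assumed to be binary units.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Beam: sums the JAR sizes quoted in the listing.
class BeamFootprint {

    // Sizes in MiB, transcribed from the rounded `ls -h` output (K values divided by 1024).
    static Map<String, Double> jarSizesMiB() {
        Map<String, Double> sizes = new LinkedHashMap<>();
        sizes.put("beam-vendor-grpc-1_13_1-0.2.jar", 13.0);
        sizes.put("beam-sdks-java-core-2.14.0.jar", 8.7);
        sizes.put("beam-vendor-sdks-java-extensions-protobuf-2.14.0.jar", 2.6);
        sizes.put("beam-vendor-guava-20_0-0.1.jar", 2.6);
        sizes.put("beam-model-pipeline-2.14.0.jar", 1.4);
        sizes.put("beam-model-fn-execution-2.14.0.jar", 825.0 / 1024);
        sizes.put("beam-model-job-management-2.14.0.jar", 470.0 / 1024);
        sizes.put("beam-runners-core-construction-java-2.14.0.jar", 446.0 / 1024);
        sizes.put("beam-runners-core-java-2.14.0.jar", 378.0 / 1024);
        return sizes;
    }

    // Total footprint in MiB of all listed JARs.
    static double totalMiB() {
        double total = 0.0;
        for (double size : jarSizesMiB().values()) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.printf("total: %.1f MiB%n", totalMiB());
    }
}
```

With these rounded inputs the total comes out slightly above 30 MiB, and the vendored gRPC JAR alone accounts for more than 40% of it.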
Because Beam is embedded by nature (it is generally sent with the job), it should stay as lightweight as possible. I see a few actions that could help make Beam integrable again:
- Make the whole polyglot (cross-language) layer optional and excludable. Many jobs never need it, and this additional weight is a clear regression on the packaging side of Beam.
- Vendoring and SDK dependencies are generally a luxury (who needs a library to do a new ArrayList<>() in 2019?), so most of the dependencies can be dropped, and vendoring can be made very lightweight, not to say optional, for the Java SDK core.
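To illustrate the second point, here is a minimal sketch of plain-JDK (Java 8) stand-ins for a few common Guava helpers (`Lists.newArrayList`, `ImmutableList.of`, `Preconditions.checkNotNull`). The class `JdkOnly` and its method names are hypothetical, and which helpers Beam actually relies on is an assumption here, not something checked against the code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical utility class, not part of Beam: JDK-only replacements for small Guava helpers.
final class JdkOnly {

    private JdkOnly() {}

    // Rough stand-in for Guava's Lists.newArrayList(...): a fresh, mutable list.
    @SafeVarargs
    static <T> List<T> mutableListOf(T... items) {
        return new ArrayList<>(Arrays.asList(items));
    }

    // Rough stand-in for ImmutableList.of(...): an unmodifiable view over a defensive copy.
    // Unlike ImmutableList, this does not reject null elements.
    @SafeVarargs
    static <T> List<T> unmodifiableListOf(T... items) {
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(items)));
    }

    // Rough stand-in for Preconditions.checkNotNull(value, message).
    static <T> T checkNotNull(T value, String name) {
        return Objects.requireNonNull(value, name + " must not be null");
    }
}
```

These are not drop-in equivalents in every respect (Guava's immutable collections give stronger guarantees), so this only shows that the simplest usages need no extra dependency.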
In the end, a reasonable limit for a runner like Spark (not the direct runner, which reimplements all the logic by design) would be around 5M of dependencies, IMHO.