Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
Application application_1404246879802_0019 failed 50 times due to AM Container for appattempt_1404246879802_0019_000050 exited with exitCode: 0 due to: Exception from container-launch:
java.io.IOException: Cannot run program "nice" (in directory "/export/content/data/samsa-yarn/usercache/samza-job/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"): error=7, Argument list too long
java.io.IOException: Cannot run program "nice" (in directory "/export/content/data/samsa-yarn/usercache/samza/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"): error=7, Argument list too long
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1042)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:448)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error=7, Argument list too long
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
    at java.lang.ProcessImpl.start(ProcessImpl.java:134)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1023)
    ... 10 more
.Failing this attempt.. Failing the application.
This happens because the launch_container.sh script generated by YARN contains all the exported environment variables (including the Samza configs) as well as the run_container scripts. Once a sufficiently large config variable has been exported, the shell can no longer launch any program: every exec from it fails with "Argument list too long".
For example, the size of the line that exports the variable "SAMZA_SYSTEM_STREAMS" in launch_container.sh is:
bash-4.1$ sed '12q;d' launch_container.sh | wc -c
167546
As indicated at http://www.in-ulm.de/~mascheck/various/argmax/, the maximum size of a single argument or environment string is bounded by MAX_ARG_STRLEN (131072 bytes).
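The per-line check above can be generalized to scan a whole launch script for lines that exceed the limit. The following is a minimal sketch, not part of YARN or Samza: the function name find_oversized_lines is hypothetical, and line length only approximates the size of the exported string, since it also counts the "export " prefix and any quoting.

```shell
# Hypothetical helper: print "<line number> <length>" for every line of a
# file that is longer than a limit (default: MAX_ARG_STRLEN, 131072).
find_oversized_lines() {
  local file=$1 limit=${2:-131072} n=0 line
  # Count bytes rather than multibyte characters.
  local LC_ALL=C
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    if [ "${#line}" -gt "$limit" ]; then
      printf '%d %d\n' "$n" "${#line}"
    fi
  done < "$file"
}
```

Running find_oversized_lines launch_container.sh in the container directory should then list each offending line together with its length.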
This can be reproduced by exporting a large variable:
[nsomasun@eat1-app201 usercache]$ sudo -uapp bash
bash-4.1$ export b1=A
bash-4.1$ export b2=$b1$b1
bash-4.1$ export b4=$b2$b2
bash-4.1$ export b8=$b4$b4
bash-4.1$ export b16=$b8$b8
bash-4.1$ export b32=$b16$b16
bash-4.1$ export b64=$b32$b32
bash-4.1$ export b128=$b64$b64
bash-4.1$ export b256=$b128$b128
bash-4.1$ export b512=$b256$b256
bash-4.1$ export b1k=$b512$b512
bash-4.1$ export b2k=$b1k$b1k
bash-4.1$ export b4k=$b2k$b2k
bash-4.1$ export b8k=$b4k$b4k
bash-4.1$ export b16k=$b8k$b8k
bash-4.1$ export b32k=$b16k$b16k
bash-4.1$ export b64k=$b32k$b32k
bash-4.1$ export b128k=$b64k$b64k
bash-4.1$ ls
bash: /bin/ls: Argument list too long
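The doubling sequence above can be condensed into a small probe that tries a given variable size. This is only a sketch: the function name exec_works_with_env_size is hypothetical, and the exact size at which exec starts failing depends on the kernel and page size. It runs in a subshell so that the oversized variable never reaches the calling shell.

```shell
# Hypothetical probe: succeeds (returns 0) if an external program can still
# be launched while a variable of $1 bytes is exported, fails otherwise.
exec_works_with_env_size() {
  (
    # Build the value before exporting it, while exec still works.
    big=$(head -c "$1" /dev/zero | tr '\0' 'A')
    export big
    # Any external program will do; the exec itself is what fails.
    /bin/sh -c ':' 2>/dev/null
  )
}

exec_works_with_env_size 1024   && echo "1 KiB variable: exec works"
exec_works_with_env_size 131072 || echo "128 KiB variable: exec fails"
```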
We need alternate mechanisms to pass configurations to the Samza container, since we are bound by the size of the variable the shell can support.
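One possible direction, shown purely as an illustration and not as the mechanism Samza or YARN uses, is to write the large value to a file and export only its path. The function name export_via_file, the ".conf" suffix and the "_FILE" variable suffix are all assumptions made for this sketch.

```shell
# Hypothetical helper: store a large value in a file and export only the
# (short) path, so the environment stays well below the exec limits.
export_via_file() {
  local name=$1 value=$2 dir=$3
  local path="$dir/$name.conf"
  # printf is a shell builtin, so writing the value involves no exec.
  printf '%s' "$value" > "$path"
  export "${name}_FILE=$path"
}
```

A container started this way would read the file named by, for example, SAMZA_SYSTEM_STREAMS_FILE instead of reading the value from its environment.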