Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
-
ASF CI
-
2
Description
Observed this on ASF CI:
[ RUN ] SlaveRecoveryTest/0.RegisterDisconnectedSlave I0111 18:52:55.691341 27071 cluster.cpp:160] Creating default 'local' authorizer I0111 18:52:55.693492 27076 master.cpp:383] Master f193ab26-63f6-4284-9de0-cd89bb73adb4 (b09aa0a37318) started on 172.17.0.3:42061 I0111 18:52:55.693522 27076 master.cpp:385] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocat or="HierarchicalDRF" --authenticate_agents="false" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http _readwrite="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/e83N9C/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --htt p_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_ag ent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --r egistry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry _strict="false" --root_submissions="true" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/e83N9C/master" --zk_session_timeout=" 10secs" I0111 18:52:55.693800 27076 master.cpp:435] Master only allowing authenticated frameworks to register I0111 18:52:55.693811 27076 master.cpp:451] Master allowing unauthenticated agents to register I0111 18:52:55.693819 27076 master.cpp:462] Master only allowing authenticated HTTP frameworks to register I0111 18:52:55.693825 27076 credentials.hpp:37] Loading credentials for authentication from '/tmp/e83N9C/credentials' I0111 18:52:55.693991 27076 master.cpp:507] Using default 'crammd5' authenticator I0111 18:52:55.694067 27076 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-readonly' I0111 18:52:55.694119 27076 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-readwrite' I0111 18:52:55.694142 27076 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-scheduler' I0111 18:52:55.694156 27076 master.cpp:587] Authorization enabled I0111 18:52:55.694834 27083 hierarchical.cpp:149] Initialized hierarchical allocator process I0111 18:52:55.694885 27083 whitelist_watcher.cpp:77] No whitelist given I0111 18:52:55.695083 27076 master.cpp:2119] Elected as the leading master! I0111 18:52:55.695096 27076 master.cpp:1641] Recovering from registrar I0111 18:52:55.695166 27076 registrar.cpp:329] Recovering registrar I0111 18:52:55.695739 27076 registrar.cpp:362] Successfully fetched the registry (0B) in 551936ns I0111 18:52:55.695775 27076 registrar.cpp:461] Applied 1 operations in 4972ns; attempting to update the registry I0111 18:52:55.696069 27076 registrar.cpp:506] Successfully updated the registry in 273920ns I0111 18:52:55.696121 27076 registrar.cpp:392] Successfully recovered registrar I0111 18:52:55.696553 27079 hierarchical.cpp:176] Skipping recovery of hierarchical allocator: nothing to recover I0111 18:52:55.696570 27073 master.cpp:1757] Recovered 0 agents from the registry (129B); allowing 10mins for agents to re-register I0111 18:52:55.698655 27071 containerizer.cpp:220] Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni W0111 18:52:55.699059 27071 backend.cpp:76] Failed to create 'aufs' backend: AufsBackend requires root privileges, but is running as user mesos W0111 18:52:55.699148 27071 backend.cpp:76] Failed to create 'bind' backend: BindBackend requires root privileges I0111 18:52:55.701257 27071 cluster.cpp:446] Creating default 'local' authorizer I0111 18:52:55.702049 27077 slave.cpp:209] Mesos agent started on (485)@172.17.0.3:42061 I0111 18:52:55.702069 27077 slave.cpp:210] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_IX53a9/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_IX53a9/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_command_executor="false" --http_credentials="/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_IX53a9/http_credentials" --http_heartbeat_interval="30secs" --image_provisioner_backend="copy" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --launcher="posix" --launcher_dir="/mesos/build/src" --logbufsecs="0" --logging_level="INFO" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;gpus:0;mem:1024;disk:1024;ports:[31000-32000]" --revocable_cpu_low_priority="true" --runtime_dir="/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_IX53a9" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_enable_support="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT" I0111 18:52:55.702533 27077 credentials.hpp:86] Loading credential for authentication from '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_IX53a9/credential' I0111 18:52:55.702687 27077 slave.cpp:352] Agent using credential for: test-principal I0111 18:52:55.702721 27077 credentials.hpp:37] Loading credentials for authentication from '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_IX53a9/http_credentials' I0111 18:52:55.702865 27077 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-agent-readonly' I0111 18:52:55.703058 27077 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-agent-readwrite' I0111 18:52:55.703541 27077 slave.cpp:539] Agent resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] I0111 18:52:55.703575 27077 slave.cpp:547] Agent attributes: [ ] I0111 18:52:55.703583 27077 slave.cpp:552] Agent hostname: b09aa0a37318 I0111 18:52:55.704030 27086 state.cpp:60] Recovering state from '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/meta' I0111 18:52:55.704193 27086 status_update_manager.cpp:203] Recovering status update manager I0111 18:52:55.704581 27086 containerizer.cpp:599] Recovering containerizer I0111 18:52:55.705227 27080 provisioner.cpp:251] Provisioner recovery complete I0111 18:52:55.705379 27086 slave.cpp:5421] Finished recovery I0111 18:52:55.705912 27086 slave.cpp:5595] Querying resource estimator for oversubscribable resources I0111 18:52:55.706006 27086 slave.cpp:5609] Received oversubscribable resources {} from the resource estimator I0111 18:52:55.706210 27081 status_update_manager.cpp:177] Pausing sending status updates I0111 18:52:55.706233 27078 slave.cpp:924] New master detected at master@172.17.0.3:42061 I0111 18:52:55.706285 27078 slave.cpp:959] Detecting new master I0111 18:52:55.715513 27078 slave.cpp:986] Authenticating with master master@172.17.0.3:42061 I0111 18:52:55.715556 27078 slave.cpp:997] Using default CRAM-MD5 authenticatee I0111 18:52:55.715667 27077 authenticatee.cpp:121] Creating new client SASL connection I0111 18:52:55.715894 27081 master.cpp:6835] Authenticating slave(485)@172.17.0.3:42061 I0111 18:52:55.715944 27077 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(986)@172.17.0.3:42061 I0111 18:52:55.716037 27075 authenticator.cpp:98] Creating new server SASL connection I0111 18:52:55.716234 27080 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5 I0111 18:52:55.716277 27080 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5' I0111 18:52:55.716333 27080 authenticator.cpp:204] Received SASL authentication start I0111 18:52:55.716477 27080 authenticator.cpp:326] Authentication requires more steps I0111 18:52:55.716605 27080 authenticatee.cpp:259] Received SASL authentication step I0111 18:52:55.716790 27080 authenticator.cpp:232] Received SASL authentication step I0111 18:52:55.716907 27080 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'b09aa0a37318' server FQDN: 'b09aa0a37318' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false I0111 18:52:55.717041 27080 auxprop.cpp:181] Looking up auxiliary property '*userPassword' I0111 18:52:55.717185 27080 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5' I0111 18:52:55.717207 27080 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'b09aa0a37318' server FQDN: 'b09aa0a37318' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true I0111 18:52:55.717247 27080 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true I0111 18:52:55.717257 27080 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true I0111 18:52:55.717296 27080 authenticator.cpp:318] Authentication success I0111 18:52:55.717373 27080 authenticatee.cpp:299] Authentication success I0111 18:52:55.717404 27086 master.cpp:6865] Successfully authenticated principal 'test-principal' at slave(485)@172.17.0.3:42061 I0111 18:52:55.717432 27081 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(986)@172.17.0.3:42061 I0111 18:52:55.717664 27080 slave.cpp:1081] Successfully authenticated with master master@172.17.0.3:42061 I0111 18:52:55.717756 27080 slave.cpp:1503] Will retry registration in 17.356946ms if necessary I0111 18:52:55.717931 27081 master.cpp:5234] Registering agent at slave(485)@172.17.0.3:42061 (b09aa0a37318) with id f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 I0111 18:52:55.718101 27077 registrar.cpp:461] Applied 1 operations in 20803ns; attempting to update the registry I0111 18:52:55.718472 27076 registrar.cpp:506] Successfully updated the registry in 338176ns I0111 18:52:55.719049 27081 slave.cpp:4285] Received ping from slave-observer(462)@172.17.0.3:42061 I0111 18:52:55.719120 27081 slave.cpp:1127] Registered with master master@172.17.0.3:42061; given agent ID f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 I0111 18:52:55.719139 27081 fetcher.cpp:90] Clearing fetcher cache I0111 18:52:55.719128 27084 master.cpp:5305] Registered agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] I0111 18:52:55.719342 27084 hierarchical.cpp:491] Added agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 (b09aa0a37318) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated: {}) I0111 18:52:55.719408 27084 hierarchical.cpp:1690] No allocations performed I0111 18:52:55.719430 27084 hierarchical.cpp:1315] Performed allocation for agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 in 61136ns I0111 18:52:55.719466 27081 slave.cpp:1155] Checkpointing SlaveInfo to '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/meta/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/slave.info' I0111 18:52:55.719480 27084 status_update_manager.cpp:184] Resuming sending status updates I0111 18:52:55.719810 27081 slave.cpp:1193] Forwarding total oversubscribed resources {} I0111 18:52:55.719862 27081 master.cpp:5712] Received update of agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) with total oversubscribed resources {} I0111 18:52:55.720095 27087 hierarchical.cpp:561] Agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 (b09aa0a37318) updated with oversubscribed resources {} (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: {}) I0111 18:52:55.720407 27087 hierarchical.cpp:1690] No allocations performed I0111 18:52:55.720433 27087 hierarchical.cpp:1315] Performed allocation for agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 in 161873ns I0111 18:52:55.721096 27071 sched.cpp:232] Version: 1.2.0 I0111 18:52:55.721359 27079 sched.cpp:336] New master detected at master@172.17.0.3:42061 I0111 18:52:55.721395 27079 sched.cpp:407] Authenticating with master master@172.17.0.3:42061 I0111 18:52:55.721410 27079 sched.cpp:414] Using default CRAM-MD5 authenticatee I0111 18:52:55.721606 27072 authenticatee.cpp:121] Creating new client SASL connection I0111 18:52:55.721803 27072 master.cpp:6835] Authenticating scheduler-491fec7d-d55e-48e3-9ca3-fce068ce2dce@172.17.0.3:42061 I0111 18:52:55.722048 27072 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(987)@172.17.0.3:42061 I0111 18:52:55.722317 27072 authenticator.cpp:98] Creating new server SASL connection I0111 18:52:55.722518 27072 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5 I0111 18:52:55.722548 27072 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5' I0111 18:52:55.722594 27072 authenticator.cpp:204] Received SASL authentication start I0111 18:52:55.722645 27072 authenticator.cpp:326] Authentication requires more steps I0111 18:52:55.722689 27072 authenticatee.cpp:259] Received SASL authentication step I0111 18:52:55.722745 27072 authenticator.cpp:232] Received SASL authentication step I0111 18:52:55.722771 27072 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'b09aa0a37318' server FQDN: 'b09aa0a37318' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false I0111 18:52:55.722784 27072 auxprop.cpp:181] Looking up auxiliary property '*userPassword' I0111 18:52:55.722797 27072 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5' I0111 18:52:55.722815 27072 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'b09aa0a37318' server FQDN: 'b09aa0a37318' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true I0111 18:52:55.722826 27072 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true I0111 18:52:55.722832 27072 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true I0111 18:52:55.722847 27072 authenticator.cpp:318] Authentication success I0111 18:52:55.722895 27072 authenticatee.cpp:299] Authentication success I0111 18:52:55.722937 27072 master.cpp:6865] Successfully authenticated principal 'test-principal' at scheduler-491fec7d-d55e-48e3-9ca3-fce068ce2dce@172.17.0.3:42061 I0111 18:52:55.722971 27072 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(987)@172.17.0.3:42061 I0111 18:52:55.723062 27072 sched.cpp:513] Successfully authenticated with master master@172.17.0.3:42061 I0111 18:52:55.723081 27072 sched.cpp:836] Sending SUBSCRIBE call to master@172.17.0.3:42061 I0111 18:52:55.723125 27072 sched.cpp:869] Will retry registration in 1.630125153secs if necessary I0111 18:52:55.723235 27072 master.cpp:2707] Received SUBSCRIBE call for framework 'default' at scheduler-491fec7d-d55e-48e3-9ca3-fce068ce2dce@172.17.0.3:42061 I0111 18:52:55.723264 27072 master.cpp:2155] Authorizing framework principal 'test-principal' to receive offers for role '*' I0111 18:52:55.723398 27072 master.cpp:2783] Subscribing framework default with checkpointing enabled and capabilities [ ] I0111 18:52:55.723603 27072 hierarchical.cpp:277] Added framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.723865 27072 hierarchical.cpp:1785] No inverse offers to send out! I0111 18:52:55.723897 27072 hierarchical.cpp:1292] Performed allocation for 1 agents in 273576ns I0111 18:52:55.724014 27072 sched.cpp:759] Framework registered with f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.724050 27072 sched.cpp:773] Scheduler::registered took 15750ns I0111 18:52:55.724215 27072 master.cpp:6664] Sending 1 offers to framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 (default) at scheduler-491fec7d-d55e-48e3-9ca3-fce068ce2dce@172.17.0.3:42061 I0111 18:52:55.724386 27072 sched.cpp:933] Scheduler::resourceOffers took 36299ns I0111 18:52:55.725075 27085 master.cpp:3662] Processing ACCEPT call for offers: [ f193ab26-63f6-4284-9de0-cd89bb73adb4-O0 ] on agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) for framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 (default) at scheduler-491fec7d-d55e-48e3-9ca3-fce068ce2dce@172.17.0.3:42061 I0111 18:52:55.725113 27085 master.cpp:3249] Authorizing framework principal 'test-principal' to launch task b969d109-e0fb-4308-9290-92b22ae8c4b7 I0111 18:52:55.725637 27085 master.cpp:8581] Adding task b969d109-e0fb-4308-9290-92b22ae8c4b7 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) I0111 18:52:55.725724 27085 master.cpp:4313] Launching task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 (default) at scheduler-491fec7d-d55e-48e3-9ca3-fce068ce2dce@172.17.0.3:42061 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) I0111 18:52:55.726084 27085 slave.cpp:1571] Got assigned task 'b969d109-e0fb-4308-9290-92b22ae8c4b7' for framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.726239 27085 slave.cpp:6262] Checkpointing FrameworkInfo to '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/meta/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/frameworks/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000/framework.info' I0111 18:52:55.726835 27085 slave.cpp:6273] Checkpointing framework pid 'scheduler-491fec7d-d55e-48e3-9ca3-fce068ce2dce@172.17.0.3:42061' to '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/meta/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/frameworks/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000/framework.pid' I0111 18:52:55.727361 27085 slave.cpp:1731] Launching task 'b969d109-e0fb-4308-9290-92b22ae8c4b7' for framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.727908 27085 paths.cpp:530] Trying to chown '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/frameworks/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000/executors/b969d109-e0fb-4308-9290-92b22ae8c4b7/runs/5ca261a9-7655-4473-ad23-4638c856c858' to user 'mesos' I0111 18:52:55.732523 27085 slave.cpp:6739] Checkpointing ExecutorInfo to '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/meta/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/frameworks/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000/executors/b969d109-e0fb-4308-9290-92b22ae8c4b7/executor.info' I0111 18:52:55.733202 27085 slave.cpp:6348] Launching executor 'b969d109-e0fb-4308-9290-92b22ae8c4b7' of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/frameworks/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000/executors/b969d109-e0fb-4308-9290-92b22ae8c4b7/runs/5ca261a9-7655-4473-ad23-4638c856c858' I0111 18:52:55.733749 27077 containerizer.cpp:991] Starting container 5ca261a9-7655-4473-ad23-4638c856c858 for executor 'b969d109-e0fb-4308-9290-92b22ae8c4b7' of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.733780 27085 slave.cpp:6762] Checkpointing TaskInfo to '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/meta/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/frameworks/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000/executors/b969d109-e0fb-4308-9290-92b22ae8c4b7/runs/5ca261a9-7655-4473-ad23-4638c856c858/tasks/b969d109-e0fb-4308-9290-92b22ae8c4b7/task.info' I0111 18:52:55.734252 27085 slave.cpp:2053] Queued task 'b969d109-e0fb-4308-9290-92b22ae8c4b7' for executor 'b969d109-e0fb-4308-9290-92b22ae8c4b7' of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.734329 27085 slave.cpp:877] Successfully attached file '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/frameworks/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000/executors/b969d109-e0fb-4308-9290-92b22ae8c4b7/runs/5ca261a9-7655-4473-ad23-4638c856c858' I0111 18:52:55.735308 27081 containerizer.cpp:1540] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"command":{"arguments":["mesos-executor","--launcher_dir=\/mesos\/build\/src"],"shell":false,"value":"\/mesos\/build\/src\/mesos-executor"},"environment":{"variables":[{"name":"LIBPROCESS_PORT","value":"0"},{"name":"MESOS_AGENT_ENDPOINT","value":"172.17.0.3:42061"},{"name":"MESOS_CHECKPOINT","value":"1"},{"name":"MESOS_DIRECTORY","value":"\/tmp\/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT\/slaves\/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0\/frameworks\/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000\/executors\/b969d109-e0fb-4308-9290-92b22ae8c4b7\/runs\/5ca261a9-7655-4473-ad23-4638c856c858"},{"name":"MESOS_EXECUTOR_ID","value":"b969d109-e0fb-4308-9290-92b22ae8c4b7"},{"name":"MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD","value":"5secs"},{"name":"MESOS_FRAMEWORK_ID","value":"f193ab26-63f6-4284-9de0-cd89bb73adb4-0000"},{"name":"MESOS_HTTP_COMMAND_EXECUTOR","value":"0"},{"name":"MESOS_RECOVERY_TIMEOUT","value":"15mins"},{"name":"MESOS_SLAVE_ID","value":"f193ab26-63f6-4284-9de0-cd89bb73adb4-S0"},{"name":"MESOS_SLAVE_PID","value":"slave(485)@172.17.0.3:42061"},{"name":"MESOS_SUBSCRIPTION_BACKOFF_MAX","value":"2secs"},{"name":"MESOS_SANDBOX","value":"\/tmp\/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT\/slaves\/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0\/frameworks\/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000\/executors\/b969d109-e0fb-4308-9290-92b22ae8c4b7\/runs\/5ca261a9-7655-4473-ad23-4638c856c858"}]},"user":"mesos","working_directory":"\/tmp\/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT\/slaves\/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0\/frameworks\/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000\/executors\/b969d109-e0fb-4308-9290-92b22ae8c4b7\/runs\/5ca261a9-7655-4473-ad23-4638c856c858"}" --pipe_read="7" --pipe_write="8" --runtime_directory="/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_IX53a9/containers/5ca261a9-7655-4473-ad23-4638c856c858" --unshare_namespace_mnt="false"' I0111 18:52:55.736647 27081 launcher.cpp:133] Forked child with pid '30914' for container '5ca261a9-7655-4473-ad23-4638c856c858' I0111 18:52:55.736830 27081 containerizer.cpp:1639] Checkpointing container's forked pid 30914 to '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/meta/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/frameworks/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000/executors/b969d109-e0fb-4308-9290-92b22ae8c4b7/runs/5ca261a9-7655-4473-ad23-4638c856c858/pids/forked.pid' I0111 18:52:55.738760 27080 fetcher.cpp:349] Starting to fetch URIs for container: 5ca261a9-7655-4473-ad23-4638c856c858, directory: /tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/frameworks/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000/executors/b969d109-e0fb-4308-9290-92b22ae8c4b7/runs/5ca261a9-7655-4473-ad23-4638c856c858 I0111 18:52:55.828845 30921 exec.cpp:162] Version: 1.2.0 I0111 18:52:55.833731 27078 slave.cpp:3322] Got registration for executor 'b969d109-e0fb-4308-9290-92b22ae8c4b7' of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 from executor(1)@172.17.0.3:44874 I0111 18:52:55.833977 27078 slave.cpp:3408] Checkpointing executor pid 'executor(1)@172.17.0.3:44874' to '/tmp/SlaveRecoveryTest_0_RegisterDisconnectedSlave_XwgqaT/meta/slaves/f193ab26-63f6-4284-9de0-cd89bb73adb4-S0/frameworks/f193ab26-63f6-4284-9de0-cd89bb73adb4-0000/executors/b969d109-e0fb-4308-9290-92b22ae8c4b7/runs/5ca261a9-7655-4473-ad23-4638c856c858/pids/libprocess.pid' I0111 18:52:55.835109 30923 exec.cpp:237] Executor registered on agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 I0111 18:52:55.835211 27078 slave.cpp:2267] Sending queued task 'b969d109-e0fb-4308-9290-92b22ae8c4b7' to executor 'b969d109-e0fb-4308-9290-92b22ae8c4b7' of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 at executor(1)@172.17.0.3:44874 Received SUBSCRIBED event Subscribed executor on b09aa0a37318 Received LAUNCH event Starting task b969d109-e0fb-4308-9290-92b22ae8c4b7 /mesos/build/src/mesos-containerizer launch --help="false" --launch_info="{"command":{"shell":true,"value":"sleep 1000"}}" --unshare_namespace_mnt="false" Forked command at 30933 I0111 18:52:55.842718 27082 slave.cpp:3754] Handling status update TASK_RUNNING (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 from executor(1)@172.17.0.3:44874 I0111 18:52:55.843504 27073 status_update_manager.cpp:323] Received status update TASK_RUNNING (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.843534 27073 status_update_manager.cpp:500] Creating StatusUpdate stream for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.843977 27073 status_update_manager.cpp:832] Checkpointing UPDATE for status update TASK_RUNNING (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.844115 27073 status_update_manager.cpp:377] Forwarding update TASK_RUNNING (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 to the agent I0111 18:52:55.844358 27073 slave.cpp:4195] Forwarding the update TASK_RUNNING (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 to master@172.17.0.3:42061 I0111 18:52:55.844611 27073 slave.cpp:4089] Status update manager successfully handled status update TASK_RUNNING (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.844825 27073 slave.cpp:4105] Sending acknowledgement for status update TASK_RUNNING (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 to executor(1)@172.17.0.3:44874 I0111 18:52:55.844782 27086 master.cpp:5848] Status update TASK_RUNNING (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 from agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) I0111 18:52:55.844988 27086 master.cpp:5910] Forwarding status update TASK_RUNNING (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.845077 27086 master.cpp:7953] Updating the state of task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 (latest state: TASK_RUNNING, status update state: TASK_RUNNING) I0111 18:52:55.845196 27086 sched.cpp:1041] Scheduler::statusUpdate took 24432ns I0111 18:52:55.845324 27086 master.cpp:4950] Processing ACKNOWLEDGE call 6fbee882-8439-4c01-91b5-49b7955349f8 for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 (default) at scheduler-491fec7d-d55e-48e3-9ca3-fce068ce2dce@172.17.0.3:42061 on agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 I0111 18:52:55.845535 27086 status_update_manager.cpp:395] Received status update acknowledgement (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.845602 27086 status_update_manager.cpp:832] Checkpointing ACK for status update TASK_RUNNING (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.845835 27086 slave.cpp:3042] Status update manager successfully handled status update acknowledgement (UUID: 6fbee882-8439-4c01-91b5-49b7955349f8) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.846096 27075 containerizer.cpp:2118] Destroying container 5ca261a9-7655-4473-ad23-4638c856c858 in RUNNING state I0111 18:52:55.846189 27075 launcher.cpp:149] Asked to destroy container 5ca261a9-7655-4473-ad23-4638c856c858 I0111 18:52:55.853996 27075 slave.cpp:4327] Got exited event for executor(1)@172.17.0.3:44874 I0111 18:52:55.882633 27079 containerizer.cpp:2481] Container 5ca261a9-7655-4473-ad23-4638c856c858 has exited I0111 18:52:55.883587 27079 provisioner.cpp:322] Ignoring destroy request for unknown container 5ca261a9-7655-4473-ad23-4638c856c858 I0111 18:52:55.884263 27080 slave.cpp:4690] Executor 'b969d109-e0fb-4308-9290-92b22ae8c4b7' of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 terminated with signal Killed I0111 18:52:55.884340 27080 slave.cpp:3754] Handling status update TASK_FAILED (UUID: a641ba05-c0cc-4d51-9bc7-5d8dae71afce) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 from @0.0.0.0:0 W0111 18:52:55.884745 27083 containerizer.cpp:1933] Ignoring update for unknown container 5ca261a9-7655-4473-ad23-4638c856c858 I0111 18:52:55.885037 27074 status_update_manager.cpp:323] Received status update TASK_FAILED (UUID: a641ba05-c0cc-4d51-9bc7-5d8dae71afce) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.885092 27074 status_update_manager.cpp:832] Checkpointing UPDATE for status update TASK_FAILED (UUID: a641ba05-c0cc-4d51-9bc7-5d8dae71afce) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.885257 27074 status_update_manager.cpp:377] Forwarding update TASK_FAILED (UUID: a641ba05-c0cc-4d51-9bc7-5d8dae71afce) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 to the agent I0111 18:52:55.885406 27082 slave.cpp:4195] Forwarding the update TASK_FAILED (UUID: a641ba05-c0cc-4d51-9bc7-5d8dae71afce) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 to master@172.17.0.3:42061 I0111 18:52:55.885622 27082 slave.cpp:796] Agent terminating I0111 18:52:55.885812 27074 master.cpp:5848] Status update TASK_FAILED (UUID: a641ba05-c0cc-4d51-9bc7-5d8dae71afce) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 from agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) I0111 18:52:55.885972 27074 master.cpp:5910] Forwarding status update TASK_FAILED (UUID: a641ba05-c0cc-4d51-9bc7-5d8dae71afce) for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.886055 27074 master.cpp:7953] Updating the state of task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 (latest state: TASK_FAILED, status update state: TASK_FAILED) /mesos/src/tests/slave_recovery_tests.cpp:2799: Failure Mock function called more times than expected - returning directly. Function call: statusUpdate(0x7ffc45655d90, @0x2ac116c60810 120-byte object <50-6B 53-0D C1-2A 00-00 00-00 00-00 00-00 00-00 DF-13 00-00 00-00 00-00 00-55 00-34 C1-2A 00-00 70-23 00-34 C1-2A 00-00 03-00 00-00 01-00 00-00 D0-C1 73-02 00-00 00-00 B0-6D 01-34 C1-2A 00-00 30-4A 02-34 C1-2A 00-00 01-00 00-00 00-2A 00-00 D3-A0 D4-CE 87-2B D6-41 10-12 00-34 C1-2A 00-00 00-00 00-00 00-00 00-00 D0-ED 00-34 C1-2A 00-00 00-00 00-00 00-00 00-00>) Expected: to be called once Actual: called twice - over-saturated and active I0111 18:52:55.886711 27082 hierarchical.cpp:1024] Recovered cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: {}) on agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 from framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 I0111 18:52:55.886860 27074 sched.cpp:1041] Scheduler::statusUpdate took 505379ns I0111 18:52:55.886862 27085 master.cpp:1261] Agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) disconnected I0111 18:52:55.887459 27085 master.cpp:3051] Disconnecting agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) I0111 18:52:55.887506 27085 master.cpp:3070] Deactivating agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) W0111 18:52:55.887595 27085 master.cpp:4942] Cannot send status update acknowledgement a641ba05-c0cc-4d51-9bc7-5d8dae71afce for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 (default) at scheduler-491fec7d-d55e-48e3-9ca3-fce068ce2dce@172.17.0.3:42061 to agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) because agent is disconnected I0111 18:52:55.887657 27085 hierarchical.cpp:590] Agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 deactivated I0111 18:52:55.888365 27085 master.cpp:5195] Removing old disconnected agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) because a registration attempt occurred I0111 18:52:55.888388 27085 master.cpp:7736] Removing agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318): a new agent registered at the same address I0111 18:52:55.888535 27085 master.cpp:5234] Registering agent at slave(485)@172.17.0.3:42061 (b09aa0a37318) with id f193ab26-63f6-4284-9de0-cd89bb73adb4-S1 I0111 18:52:55.888734 27085 registrar.cpp:461] Applied 1 operations in 30248ns; attempting to update the registry I0111 18:52:55.889088 27085 registrar.cpp:506] Successfully updated the registry in 321024ns I0111 18:52:55.889161 27085 registrar.cpp:461] Applied 1 operations in 10441ns; attempting to update the registry I0111 18:52:55.889709 27085 registrar.cpp:506] Successfully updated the registry in 393984ns I0111 18:52:55.889269 27076 master.cpp:7778] Removed agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318): a new agent registered at the same address I0111 18:52:55.889883 27076 master.cpp:7953] Updating the state of task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 (latest state: TASK_FAILED, status update state: TASK_LOST) I0111 18:52:55.889982 27076 master.cpp:8049] Removing task b969d109-e0fb-4308-9290-92b22ae8c4b7 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 on agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 at slave(485)@172.17.0.3:42061 (b09aa0a37318) I0111 18:52:55.890242 27076 master.cpp:5905] Sending status update TASK_LOST for task b969d109-e0fb-4308-9290-92b22ae8c4b7 of framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 'Agent b09aa0a37318 removed: a new agent registered at the same address' I0111 18:52:55.890367 27076 master.cpp:2036] Notifying framework f193ab26-63f6-4284-9de0-cd89bb73adb4-0000 (default) at scheduler-491fec7d-d55e-48e3-9ca3-fce068ce2dce@172.17.0.3:42061 of lost agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S0 (b09aa0a37318) I0111 18:52:55.890631 27076 master.cpp:5305] Registered agent f193ab26-63f6-4284-9de0-cd89bb73adb4-S1 at slave(485)@172.17.0.3:42061 (b09aa0a37318) with cpus(*):2; mem(*):1024; :