jclouds / JCLOUDS-1178

AWS vm deletion failed: did not retry on response "The service is unavailable. Please try again shortly"


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.9.2
    • Fix Version/s: None
    • Component/s: jclouds-compute

    Description

      Using jclouds 1.9.2, I hit the error below while attempting to stop a VM in aws-ec2:us-east-1:

      Failed after 39.1s: Error invoking stop at MySqlNodeImpl{id=zrxtil48g6}: AWSResponseException: request POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1 failed with code 503, error: AWSError{requestId='d4a38d1b-4f02-4161-b7ff-6cd78bdefe41', requestToken='null', code='Unavailable', message='The service is unavailable. Please try again shortly.', context='{Response=, Errors=}'}
      
      org.apache.brooklyn.core.mgmt.internal.EffectorUtils$EffectorCallPropagatedRuntimeException: Error invoking stop at MySqlNodeImpl{id=zrxtil48g6}: AWSResponseException: request POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1 failed with code 503, error: AWSError{requestId='d4a38d1b-4f02-4161-b7ff-6cd78bdefe41', requestToken='null', code='Unavailable', message='The service is unavailable. Please try again shortly.', context='{Response=, Errors=}'}
      	at org.apache.brooklyn.core.mgmt.internal.EffectorUtils$EffectorCallPropagatedRuntimeException.propagate(EffectorUtils.java:318)
      	at org.apache.brooklyn.core.mgmt.internal.EffectorUtils$EffectorCallPropagatedRuntimeException.access$100(EffectorUtils.java:285)
      	at org.apache.brooklyn.core.mgmt.internal.EffectorUtils.handleEffectorException(EffectorUtils.java:326)
      	at org.apache.brooklyn.core.effector.EffectorTasks$EffectorBodyTaskFactory$2.handleException(EffectorTasks.java:90)
      	at org.apache.brooklyn.util.core.task.DynamicSequentialTask.handleException(DynamicSequentialTask.java:469)
      	at org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:417)
      	at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.brooklyn.util.exceptions.PropagatedRuntimeException: AWSResponseException: request POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1 failed with code 503, error: AWSError{requestId='d4a38d1b-4f02-4161-b7ff-6cd78bdefe41', requestToken='null', code='Unavailable', message='The service is unavailable. Please try again shortly.', context='{Response=, Errors=}'}
      	at org.apache.brooklyn.util.exceptions.Exceptions.propagate(Exceptions.java:129)
      	at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks.doStop(MachineLifecycleEffectorTasks.java:863)
      	at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks.stop(MachineLifecycleEffectorTasks.java:743)
      	at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks$StopEffectorBody.call(MachineLifecycleEffectorTasks.java:308)
      	at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks$StopEffectorBody.call(MachineLifecycleEffectorTasks.java:305)
      	at org.apache.brooklyn.core.effector.EffectorTasks$EffectorBodyTaskFactory$1.call(EffectorTasks.java:82)
      	at org.apache.brooklyn.util.core.task.DynamicSequentialTask$DstJob.call(DynamicSequentialTask.java:359)
      	... 5 more
      Caused by: java.util.concurrent.ExecutionException: org.jclouds.aws.AWSResponseException: request POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1 failed with code 503, error: AWSError{requestId='d4a38d1b-4f02-4161-b7ff-6cd78bdefe41', requestToken='null', code='Unavailable', message='The service is unavailable. Please try again shortly.', context='{Response=, Errors=}'}
      	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:188)
      	at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:63)
      	at org.apache.brooklyn.util.core.task.BasicTask.get(BasicTask.java:361)
      	at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks.doStop(MachineLifecycleEffectorTasks.java:851)
      	... 10 more
      Caused by: org.jclouds.aws.AWSResponseException: request POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1 failed with code 503, error: AWSError{requestId='d4a38d1b-4f02-4161-b7ff-6cd78bdefe41', requestToken='null', code='Unavailable', message='The service is unavailable. Please try again shortly.', context='{Response=, Errors=}'}
      	at org.jclouds.aws.handlers.ParseAWSErrorFromXmlContent.handleError(ParseAWSErrorFromXmlContent.java:75)
      	at org.jclouds.http.handlers.DelegatingErrorHandler.handleError(DelegatingErrorHandler.java:67)
      	at org.jclouds.http.internal.BaseHttpCommandExecutorService.shouldContinue(BaseHttpCommandExecutorService.java:136)
      	at org.jclouds.http.internal.BaseHttpCommandExecutorService.invoke(BaseHttpCommandExecutorService.java:105)
      	at org.jclouds.rest.internal.InvokeHttpMethod.invoke(InvokeHttpMethod.java:90)
      	at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:73)
      	at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:44)
      	at org.jclouds.reflect.FunctionalReflection$FunctionalInvocationHandler.handleInvocation(FunctionalReflection.java:117)
      	at com.google.common.reflect.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:87)
      	at com.sun.proxy.$Proxy160.describeInstancesInRegion(Unknown Source)
      	at org.jclouds.ec2.compute.strategy.EC2GetNodeMetadataStrategy.getRunningInstanceInRegion(EC2GetNodeMetadataStrategy.java:64)
      	at org.jclouds.aws.ec2.compute.strategy.AWSEC2GetNodeMetadataStrategy.getRunningInstanceInRegion(AWSEC2GetNodeMetadataStrategy.java:52)
      	at org.jclouds.ec2.compute.strategy.EC2GetNodeMetadataStrategy.getNode(EC2GetNodeMetadataStrategy.java:56)
      	at org.jclouds.compute.predicates.AtomicNodeTerminated.refreshOrNull(AtomicNodeTerminated.java:42)
      	at org.jclouds.compute.predicates.AtomicNodeTerminated.refreshOrNull(AtomicNodeTerminated.java:28)
      	at org.jclouds.compute.predicates.internal.TrueIfNullOrDeletedRefreshAndDoubleCheckOnFalse.apply(TrueIfNullOrDeletedRefreshAndDoubleCheckOnFalse.java:46)
      	at org.jclouds.compute.predicates.internal.TrueIfNullOrDeletedRefreshAndDoubleCheckOnFalse.apply(TrueIfNullOrDeletedRefreshAndDoubleCheckOnFalse.java:31)
      	at org.jclouds.util.Predicates2$RetryablePredicate.apply(Predicates2.java:117)
      	at org.jclouds.compute.internal.BaseComputeService.doDestroyNode(BaseComputeService.java:302)
      	at org.jclouds.compute.internal.BaseComputeService.destroyNode(BaseComputeService.java:243)
      	at org.apache.brooklyn.location.jclouds.JcloudsLocation.releaseNode(JcloudsLocation.java:2701)
      	at org.apache.brooklyn.location.jclouds.JcloudsLocation.release(JcloudsLocation.java:2651)
      	at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks.stopAnyProvisionedMachines(MachineLifecycleEffectorTasks.java:1021)
      	at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks$StopAnyProvisionedMachinesTask.call(MachineLifecycleEffectorTasks.java:875)
      	at org.apache.brooklyn.entity.software.base.lifecycle.MachineLifecycleEffectorTasks$StopAnyProvisionedMachinesTask.call(MachineLifecycleEffectorTasks.java:873)
      	... 6 more
      

      Digging into the jclouds wire log, I see:

      2016-09-13 21:00:09,811 DEBUG o.j.r.i.InvokeHttpMethod [brooklyn-execmanager-YxAEEeJK-3722]: >> invoking DescribeInstances
      2016-09-13 21:00:09,812 DEBUG o.j.h.i.JavaUrlHttpCommandExecutorService [brooklyn-execmanager-YxAEEeJK-3722]: Sending request -1247513718: POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1
      2016-09-13 21:00:09,812 DEBUG jclouds.wire [brooklyn-execmanager-YxAEEeJK-3722]: >> "Action=DescribeInstances&InstanceId.1=i-331bcea0&Version=2012-06-01"
      2016-09-13 21:00:09,812 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: >> POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1
      2016-09-13 21:00:09,812 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: >> Host: ec2.us-east-1.amazonaws.com
      2016-09-13 21:00:09,812 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: >> X-Amz-Date: 20160913T200009Z
      2016-09-13 21:00:09,812 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: >> Authorization: AWS4-HMAC-SHA256 Credential=AAAAAAAAAAAAAAAAAAAA/20160913/us-east-1/ec2/aws4_request, SignedHeaders=content-type;host;x-amz-date, Signature=84cf762b171d56299b096a106733e72bbf96d5654b4c607135cea8cd7be51689
      2016-09-13 21:00:09,812 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: >> Content-Type: application/x-www-form-urlencoded
      2016-09-13 21:00:09,812 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: >> Content-Length: 67
      2016-09-13 21:00:10,294 DEBUG o.j.h.i.JavaUrlHttpCommandExecutorService [brooklyn-execmanager-YxAEEeJK-3722]: Receiving response -1247513718: HTTP/1.1 503 Service Unavailable
      2016-09-13 21:00:10,294 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: << HTTP/1.1 503 Service Unavailable
      2016-09-13 21:00:10,294 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: << Date: Tue, 13 Sep 2016 20:00:10 GMT
      2016-09-13 21:00:10,294 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: << Transfer-Encoding: chunked
      2016-09-13 21:00:10,294 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: << Connection: close
      2016-09-13 21:00:10,294 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: << Server: AmazonEC2
      2016-09-13 21:00:10,294 DEBUG jclouds.headers [brooklyn-execmanager-YxAEEeJK-3722]: << Content-Type: application/unknown
      2016-09-13 21:00:10,296 DEBUG jclouds.wire [brooklyn-execmanager-YxAEEeJK-3722]: << "<?xml version="1.0" encoding="UTF-8"?>[\n]"
      2016-09-13 21:00:10,296 DEBUG jclouds.wire [brooklyn-execmanager-YxAEEeJK-3722]: << "<Response><Errors><Error><Code>Unavailable</Code><Message>The service is unavailable. Please try again shortly.</Message></Error></Errors><RequestID>d4a38d1b-4f02-4161-b7ff-6cd78bdefe41</RequestID></Response>"
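
The error code that drives the retry decision is the `<Code>` element of the response body shown above. As a rough illustration, the sketch below extracts that element with the JDK's DOM parser. The class and method names (`Ec2ErrorBody`, `firstErrorCode`) are hypothetical; this is not how jclouds parses the body (the stack trace shows it goes through ParseAWSErrorFromXmlContent).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of jclouds.
class Ec2ErrorBody {
    // Returns the text of the first <Code> element, or null if there is none.
    static String firstErrorCode(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList codes = doc.getElementsByTagName("Code");
            return codes.getLength() == 0 ? null : codes.item(0).getTextContent();
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers need no throws clause.
            throw new IllegalArgumentException("Not a parseable error body", e);
        }
    }
}
```

Applied to the body in the wire log, this yields the string "Unavailable", which is the value that would need to be in the retryable set.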
      

      In the jclouds code, org.jclouds.aws.config.AWSHttpApiModule.provideRetryableServerCodes shows that "Unavailable" is not a code on which we retry: we only retry on the server errors "RequestLimitExceeded" and "InternalError".
      (These codes are passed into AWSServerErrorRetryHandler.shouldRetryRequestOnError.)

      Looking at http://docs.aws.amazon.com/AWSEC2/latest/APIReference/errors-overview.html, I think we should also retry on:

      • ServiceUnavailable: The request has failed due to a temporary failure of the server.
      • Unavailable: The server is overloaded and can't handle the request.
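
The proposal amounts to widening the set of retryable server error codes. Below is a minimal standalone sketch of that idea, assuming the retry decision is simply "5xx status and code is in the set". The class, field, and method names (`RetryableServerCodes`, `CURRENT`, `PROPOSED`, `shouldRetry`) are hypothetical and are not jclouds API; in jclouds the set is supplied by AWSHttpApiModule.provideRetryableServerCodes and consumed by AWSServerErrorRetryHandler.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the jclouds implementation.
class RetryableServerCodes {
    // Codes retried today, per the description above.
    static final Set<String> CURRENT = Collections.unmodifiableSet(
        new HashSet<String>(Arrays.asList("RequestLimitExceeded", "InternalError")));

    // Proposed: additionally retry on the two "temporarily unavailable" codes.
    static final Set<String> PROPOSED = Collections.unmodifiableSet(
        new HashSet<String>(Arrays.asList(
            "RequestLimitExceeded", "InternalError", "ServiceUnavailable", "Unavailable")));

    // Assumed rule: retry only server-side (5xx) failures whose code is in the set.
    static boolean shouldRetry(Set<String> retryableCodes, int httpStatus, String errorCode) {
        return httpStatus >= 500 && retryableCodes.contains(errorCode);
    }
}
```

With the current set, the 503 "Unavailable" response from the wire log is not retried; with the proposed set it would be.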

      For client errors, I also wonder whether we should retry on:

      • RequestExpired (given that we retry on SignatureDoesNotMatch - will we automatically reauthenticate?).
      • ConcurrentTagAccess: You can't run simultaneous commands to modify a tag for a specific resource. Allow sufficient wait time for the previous request to complete, then retry your request. For more information, see Error Retries and Exponential Backoff in AWS.

          People

            Assignee: Unassigned
            Reporter: Aled Sage (aled.sage)