Details
-
Improvement
-
Status: Resolved
-
P2
-
Resolution: Fixed
-
None
Description
TestPubsub currently uses Beam's custom PubSub client, which occasionally leads to flaky failures due to requests that timed out or could be retried, e.g.:
1) failNotFound_matchingDCSchema_sendsMessages(com.google.cloud.dataflow.sqllauncher.PubsubWriteIT) io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: Deadline expired before operation could complete. at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:240) at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:221) at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:140) at com.google.pubsub.v1.SubscriberGrpc$SubscriberBlockingStub.createSubscription(SubscriberGrpc.java:1674) at org.apache.beam.sdk.io.gcp.pubsub.PubsubGrpcClient.createSubscription(PubsubGrpcClient.java:352) at org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.createRandomSubscription(PubsubClient.java:429) at org.apache.beam.sdk.io.gcp.pubsub.TestPubsub.initializePubsub(TestPubsub.java:123) at org.apache.beam.sdk.io.gcp.pubsub.TestPubsub.access$200(TestPubsub.java:57) at org.apache.beam.sdk.io.gcp.pubsub.TestPubsub$1.evaluate(TestPubsub.java:102) at org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:266) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
We should look into using the GCP client instead which should have better tuned timeouts and retry policies by default.