RabbitMQ inbound connector does not try to reconnect after connection error (RabbitMQ server was temporary down) #2828

Closed
igor-msk opened this issue Jul 9, 2024 · 7 comments · Fixed by #3281
Assignees
Labels
component:qa Task containing all details related to QA kind:enhancement New feature or request

Comments

@igor-msk

igor-msk commented Jul 9, 2024

Is your feature request related to a problem? Please describe.
Our RabbitMQ server sometimes goes down. All other services automatically reactivate their subscribers once it is up again, but the Camunda inbound RabbitMQ connector does not: it writes an error to its log, and I have to manually restart the connector pods to reactivate the subscriptions to RabbitMQ.

Describe the solution you'd like
It should retry subscribing indefinitely.
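To illustrate the requested behavior, here is a minimal sketch of an unbounded retry loop with capped exponential backoff. It uses only the Java standard library. `SubscribeRetry`, `retryForever`, and `nextDelay` are hypothetical names chosen for this example and are not part of the connector runtime or the RabbitMQ client, and the `subscribe` callable stands in for whatever code re-creates the connection and consumer.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of the Camunda connector runtime. */
final class SubscribeRetry {

  /** Doubles the delay after a failed attempt, capped at maxDelay. */
  static Duration nextDelay(Duration current, Duration maxDelay) {
    Duration doubled = current.multipliedBy(2);
    return doubled.compareTo(maxDelay) > 0 ? maxDelay : doubled;
  }

  /**
   * Calls subscribe until it succeeds and returns its result.
   * Gives up only when the calling thread is interrupted.
   */
  static <T> T retryForever(Callable<T> subscribe, Duration initialDelay, Duration maxDelay)
      throws InterruptedException {
    Duration delay = initialDelay;
    while (true) {
      try {
        return subscribe.call();
      } catch (InterruptedException e) {
        // Respect shutdown requests instead of retrying.
        throw e;
      } catch (Exception e) {
        // Assumed meaning: the broker is still unreachable. Wait, then retry with a longer delay.
        Thread.sleep(delay.toMillis());
        delay = nextDelay(delay, maxDelay);
      }
    }
  }
}
```

Capping the delay keeps the retry rate bounded during a long outage while still letting the subscription come back shortly after the broker is reachable again.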

Describe alternatives you've considered
Manually restarting the connector pods.

Additional context

2024-07-08T13:53:03.166Z ERROR 1 --- [pool-4-thread-3] i.c.c.rabbitmq.inbound.RabbitMqConsumer  : Consumer shutdown: sbf-it-cam-dockerwebservice-test-connectors-77f54cb85b-4mzsj

com.rabbitmq.client.ShutdownSignalException: connection error
        at com.rabbitmq.client.impl.AMQConnection.startShutdown(AMQConnection.java:1007)
        at com.rabbitmq.client.impl.AMQConnection.shutdown(AMQConnection.java:997)
        at com.rabbitmq.client.impl.AMQConnection.handleFailure(AMQConnection.java:797)
        at com.rabbitmq.client.impl.AMQConnection.access$500(AMQConnection.java:48)
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:696)
        at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException: null
        at java.base/java.io.DataInputStream.readUnsignedByte(Unknown Source)
        at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:91)
        at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:199)
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:687)
        ... 1 common frames omitted

2024-07-08T13:53:03.176Z  INFO 1 --- [pool-4-thread-3] i.c.c.r.inbound.RabbitMqExecutable       : Subscription deactivation requested by the Connector runtime
2024-07-08T13:53:03.177Z  WARN 1 --- [pool-4-thread-3] i.c.c.r.inbound.RabbitMqExecutable       : Failed to cancel consumer

com.rabbitmq.client.AlreadyClosedException: connection is already closed due to connection error; cause: java.io.EOFException
        at com.rabbitmq.client.impl.AMQChannel.ensureIsOpen(AMQChannel.java:281)
        at com.rabbitmq.client.impl.AMQChannel.rpc(AMQChannel.java:365)
        at com.rabbitmq.client.impl.ChannelN.basicCancel(ChannelN.java:1515)
        at com.rabbitmq.client.impl.recovery.AutorecoveringChannel.basicCancel(AutorecoveringChannel.java:650)
        at io.camunda.connector.rabbitmq.inbound.RabbitMqExecutable.deactivate(RabbitMqExecutable.java:69)
        at io.camunda.connector.runtime.inbound.lifecycle.InboundConnectorManager.deactivateConnector(InboundConnectorManager.java:192)
        at java.base/java.util.Optional.ifPresent(Unknown Source)
        at io.camunda.connector.runtime.inbound.lifecycle.InboundConnectorManager.deactivateConnector(InboundConnectorManager.java:185)
        at io.camunda.connector.runtime.inbound.lifecycle.InboundConnectorManager.lambda$activateConnector$3(InboundConnectorManager.java:128)
        at io.camunda.connector.runtime.core.inbound.InboundConnectorContextImpl.cancel(InboundConnectorContextImpl.java:113)
        at io.camunda.connector.rabbitmq.inbound.RabbitMqConsumer.handleShutdownSignal(RabbitMqConsumer.java:74)
        at com.rabbitmq.client.impl.ConsumerDispatcher.notifyConsumerOfShutdown(ConsumerDispatcher.java:197)
        at com.rabbitmq.client.impl.ConsumerDispatcher.notifyConsumersOfShutdown(ConsumerDispatcher.java:189)
        at com.rabbitmq.client.impl.ConsumerDispatcher.access$200(ConsumerDispatcher.java:36)
        at com.rabbitmq.client.impl.ConsumerDispatcher$6.run(ConsumerDispatcher.java:176)
        at com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:111)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)

2024-07-08T13:53:03.291Z ERROR 1 --- [pool-4-thread-3] i.c.c.r.i.l.InboundConnectorManager      : Failed to deactivate inbound connector ActiveInboundConnector[executable=io.camunda.connector.rabbitmq.inbound.RabbitMqExecutable@12fcf089, context=InboundConnectorContextImpl{definition=InboundConnectorDefinitionImpl{correlationPoint=StartEventCorrelationPoint[bpmnProcessId=BidFactoring1C, version=3, processDefinitionKey=2251799813689986], bpmnProcessId='BidFactoring1C', version=3, processDefinitionKey=2251799813689986, elementId='startEvent_messageFrom1C, tenantId='<default>'}}]

com.rabbitmq.client.AlreadyClosedException: connection is already closed due to connection error; cause: java.io.EOFException
        at com.rabbitmq.client.impl.AMQConnection.startShutdown(AMQConnection.java:1012)
        at com.rabbitmq.client.impl.AMQConnection.close(AMQConnection.java:1127)
        at com.rabbitmq.client.impl.AMQConnection.close(AMQConnection.java:1056)
        at com.rabbitmq.client.impl.AMQConnection.close(AMQConnection.java:1040)
        at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.close(AutorecoveringConnection.java:300)
        at io.camunda.connector.rabbitmq.inbound.RabbitMqExecutable.deactivate(RabbitMqExecutable.java:74)
        at io.camunda.connector.runtime.inbound.lifecycle.InboundConnectorManager.deactivateConnector(InboundConnectorManager.java:192)
        at java.base/java.util.Optional.ifPresent(Unknown Source)
        at io.camunda.connector.runtime.inbound.lifecycle.InboundConnectorManager.deactivateConnector(InboundConnectorManager.java:185)
        at io.camunda.connector.runtime.inbound.lifecycle.InboundConnectorManager.lambda$activateConnector$3(InboundConnectorManager.java:128)
        at io.camunda.connector.runtime.core.inbound.InboundConnectorContextImpl.cancel(InboundConnectorContextImpl.java:113)
        at io.camunda.connector.rabbitmq.inbound.RabbitMqConsumer.handleShutdownSignal(RabbitMqConsumer.java:74)
        at com.rabbitmq.client.impl.ConsumerDispatcher.notifyConsumerOfShutdown(ConsumerDispatcher.java:197)
        at com.rabbitmq.client.impl.ConsumerDispatcher.notifyConsumersOfShutdown(ConsumerDispatcher.java:189)
        at com.rabbitmq.client.impl.ConsumerDispatcher.access$200(ConsumerDispatcher.java:36)
        at com.rabbitmq.client.impl.ConsumerDispatcher$6.run(ConsumerDispatcher.java:176)
        at com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:111)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)
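
Reading the traces above, deactivation appears to attempt both a consumer cancel and a connection close on a connection that is already closed, and each step fails with `AlreadyClosedException`. One way to keep such cleanup from getting in the way of a later reconnect is to run every step on a best-effort basis and collect the failures. The following is only a sketch under that assumption: `QuietCleanup`, `Step`, and `runAll` are hypothetical names, and nothing here reflects how the connector actually handles this.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not taken from the connector code. */
final class QuietCleanup {

  /** A cleanup step that may throw, e.g. cancelling a consumer or closing a connection. */
  interface Step {
    void run() throws Exception;
  }

  /** Runs every step even if earlier ones fail and returns the collected failures in order. */
  static List<Exception> runAll(Step... steps) {
    List<Exception> failures = new ArrayList<>();
    for (Step step : steps) {
      try {
        step.run();
      } catch (Exception e) {
        // Record the failure and continue with the remaining steps.
        failures.add(e);
      }
    }
    return failures;
  }
}
```

The caller can then log the returned failures at a low severity and proceed to re-subscribe, rather than treating a failed close of an already-closed connection as fatal.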
@igor-msk igor-msk added the kind:enhancement New feature or request label Jul 9, 2024
@sbuettner
Contributor

@chillleader Could you take a quick look at whether this could be supported by the client library that is already in use?

@igor-msk
Author

This is now solved for the Kafka connector in version 8.5.5. How about implementing the same retries for RabbitMQ?

@sbuettner
Contributor

@chillleader Can we apply the same mechanism here as we did for the Kafka connector?

@chillleader
Member

Yes, we can do the same here 👍 I'll take a look

@igor-msk
Author

Any news on this topic?

@chillleader
Member

Sorry for the delay - lots of topics on our plate currently 😅 I drafted a solution: #3281
I still need to run a few tests against a real server. Hopefully, this fix will make it into the 8.6 release (next month).

@chillleader chillleader added the component:qa Task containing all details related to QA label Sep 20, 2024