ALL_BROKERS_DOWN on Producer #1813

Open
1 of 7 tasks
dusanu87 opened this issue Sep 12, 2024 · 0 comments

Description

We have observed the following behavior since moving to version 2.5.0.
After regular AWS MSK maintenance, during which all brokers are restarted one by one, the following errors appear in the logs:
cimpl.KafkaException: KafkaError{code=_ALL_BROKERS_DOWN,val=-187,str="3/3 brokers are down"}
and
cimpl.KafkaException: KafkaError{code=_TRANSPORT,val=-195,str="ssl://b-3.XXX.kafka.XXX.amazonaws.com:9094/3: Disconnected (after 1091448ms in state UP)"}
These errors appear on the producer side and occasionally recur even 2-3 days after the broker restart. They also appear during a Kubernetes deployment restart, when flush is called on the producer.
The issue only appears after the broker restarts performed during regular MSK maintenance.
The issue is also present in version 2.4.0; we did not encounter it with versions before 2.4.0.
Once the application is restarted on K8s, the issue goes away.
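
To help triage log lines like the ones above, here is a minimal sketch of an `error_cb` that separates connectivity errors from fatal ones. It assumes that `_ALL_BROKERS_DOWN` and `_TRANSPORT` are informational and that librdkafka keeps reconnecting on its own; the numeric codes are copied from the log lines above. The names `classify_error`, `make_error_cb`, and `on_fatal` are hypothetical, and this is not the `error_cb` referenced in the client configuration below.

```python
import logging

log = logging.getLogger("kafka.producer")

# Numeric codes as they appear in the log lines above
# (KafkaError._ALL_BROKERS_DOWN and KafkaError._TRANSPORT in confluent_kafka).
ALL_BROKERS_DOWN = -187
TRANSPORT = -195
CONNECTIVITY_CODES = {ALL_BROKERS_DOWN, TRANSPORT}


def classify_error(err):
    """Return 'fatal', 'connectivity', or 'other' for a KafkaError-like object.

    Only err.code() and err.fatal() are used, both of which exist on
    confluent_kafka.KafkaError.
    """
    if err.fatal():
        return "fatal"
    if err.code() in CONNECTIVITY_CODES:
        return "connectivity"
    return "other"


def make_error_cb(on_fatal):
    """Build an error_cb; on_fatal is a hypothetical hook (e.g. trigger a restart)."""
    def error_cb(err):
        kind = classify_error(err)
        if kind == "fatal":
            log.error("fatal Kafka error: %s", err)
            on_fatal(err)
        elif kind == "connectivity":
            # Assumption: librdkafka retries the connection by itself, so log only.
            log.warning("broker connectivity problem: %s", err)
        else:
            log.warning("Kafka error: %s", err)
    return error_cb
```

Such a callback would be passed as `"error_cb": make_error_cb(on_fatal)` in the producer configuration. Logging by category makes it easier to see whether the errors seen days after the restart are transient reconnects or something the client cannot recover from.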

How to reproduce

Restart one broker on the AWS MSK cluster (3 brokers).
Restart the K8s deployment (the application running the producer).
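
Because the deployment restart calls `flush` on the producer, a bounded flush that reports what is left in the queue may help when reproducing. The sketch below relies on `Producer.flush(timeout)` returning the number of messages still queued; `drain`, its parameters, and the retry count are hypothetical and not taken from the application.

```python
import logging

log = logging.getLogger("kafka.producer")


def drain(producer, timeout_s=10.0, attempts=3):
    """Flush with a bounded wait and report how many messages are left.

    producer is expected to be a confluent_kafka.Producer; only
    flush(timeout), which returns the number of messages still queued,
    is used here.
    """
    remaining = 0
    for attempt in range(1, attempts + 1):
        remaining = producer.flush(timeout_s)
        if remaining == 0:
            return 0
        log.warning("flush attempt %d/%d: %d messages still queued",
                    attempt, attempts, remaining)
    return remaining
```

Calling `drain(producer)` from the shutdown path and logging a non-zero result would show whether messages are stranded at the moment the errors are raised.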

Checklist

Please provide the following information:

  • confluent-kafka-python and librdkafka version (confluent_kafka.version() and confluent_kafka.libversion()):
    (2.5.0) (2.5.0)
  • Apache Kafka broker version: (3.5.1)
  • Client configuration:
    {
      "queue.buffering.max.messages": settings.KAFKA_PRODUCER_QUEUE_COUNT,
      "queue.buffering.max.kbytes": settings.KAFKA_PRODUCER_QUEUE_BUFF_KBYTES,
      "linger.ms": settings.KAFKA_PRODUCER_LINGER,
      "bootstrap.servers": settings.KAFKA_BROKERS,
      "enable.idempotence": True,
      "acks": "all",
      "delivery.timeout.ms": settings.KAFKA_PRODUCER_DELIVERY_TIMEOUT_MS,
      "security.protocol": "SSL",
      "error_cb": error_cb,
    }
  • Operating system:
  • Provide client logs (with 'debug': '..' as necessary)
  • Provide broker log excerpts
  • Critical issue
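
For the client-log item in the checklist, a minimal sketch of turning on librdkafka debug output without changing the rest of the configuration. `with_debug` is a hypothetical helper; `broker`, `protocol`, and `security` are existing librdkafka debug contexts, chosen here on the assumption that the disconnects are connection- or TLS-related.

```python
def with_debug(config, contexts=("broker", "protocol", "security")):
    """Return a copy of a producer config dict with the 'debug' property set.

    The original dict is left untouched so the debug setting can be
    enabled only while collecting logs.
    """
    debug_config = dict(config)
    debug_config["debug"] = ",".join(contexts)
    return debug_config
```

The producer would then be created from `with_debug(config)` while reproducing the issue.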