We are getting random NetworkExceptions and TimeoutExceptions in our production environment:
Brokers: 3
Zookeepers: 3
Servers: 3
Kafka: 0.10.0.1
Zookeeper: 3.4.3
We are occasionally getting this exception in our producer logs:
Expiring 10 record(s) for TOPIC:XXXXXX: 5608 ms has passed since batch creation plus linger time.
The number of milliseconds in these error messages keeps changing. Sometimes it's ~5 seconds, other times it's up to ~13 seconds!
And very rarely we get:
NetworkException: Server disconnected before response received.
The cluster consists of 3 brokers and 3 ZooKeeper nodes. The producer server and the Kafka cluster are on the same network.
I am making synchronous calls. There is a web service that multiple user requests call to send their data. This Kafka web service has a single Producer object that does all the sending. The producer's request timeout was initially 1000 ms and has since been changed to 15000 ms (15 seconds). Even after increasing the timeout, TimeoutExceptions are still showing up in the error logs.
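For reference, here is a minimal sketch of the producer settings described above. It is not our exact code: the broker addresses and the serializer classes are placeholders I am assuming for illustration, and the class name ProducerSettings is hypothetical. Only the request timeout value and the 3-broker layout come from our actual setup.

```java
import java.util.Properties;

public class ProducerSettings {
    // Hypothetical broker addresses; the real host names are not shown here.
    static final String BROKERS = "broker1:9092,broker2:9092,broker3:9092";

    static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", BROKERS);
        // Raised from the initial 1000 ms to 15000 ms, as described above.
        props.put("request.timeout.ms", "15000");
        // Assumed serializers; the actual ones may differ.
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        // In the web service these properties are passed once to the single
        // shared producer; each request then blocks until its send completes.
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Everything not listed here is left at the client defaults as far as I know.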
What can be the reason?