We are collecting logs and sending them to Kafka with the rdkafka2 plugin (https://github.com/fluent/fluent-plugin-kafka).
We want to discard logs that hit permanent errors after multiple retries, e.g. message size too large or unsupported message format (the permanent errors listed in https://github.com/fluent/fluent-plugin-kafka).
However, with multi-threaded fluentd buffers, the retry counter is reset as soon as any one of the threads successfully flushes a buffer chunk. A failing buffer chunk therefore never reaches its maximum retry time/count (2 hours / 100 times), so it keeps retrying and never gets discarded.
This issue has been discussed before in #1854.
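For reference, roughly the kind of output configuration I mean (broker address, topic, tag, and buffer path are placeholders, not my production values):

```
<match app.logs>
  @type rdkafka2
  brokers kafka-broker:9092          # placeholder broker
  default_topic logs                 # placeholder topic
  <buffer>
    @type file
    path /var/log/fluentd/buffer     # placeholder path
    flush_thread_count 8             # multiple flush threads: a success on any
                                     # of them resets the shared retry counter
    retry_timeout 2h                 # intended: give up on a chunk after 2 hours
    retry_max_times 100              # intended: give up after 100 attempts
  </buffer>
</match>
```

With this setup, `retry_timeout` and `retry_max_times` are effectively never reached for a chunk with a permanent error, because other chunks keep flushing successfully.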
I tried setting `thread_cnt` to 1 and assigning more workers (processes) to each node, but it looks like the single thread gets blocked by the buffer chunk that keeps retrying, and won't process any other buffer chunks.

I am wondering if there is a solution to handle such multi-threaded retries. The ideal `retry_timeout` for my scenario is 2 hours. For now, the retry counter is usually reset after about 10 retries within 15 minutes.