
The maxPendingAddRequestsPerThread configuration is inconsistent with the actual behavior, resulting in netty direct memory OOM #4465

Open
AuroraTwinkle opened this issue Jul 18, 2024 · 3 comments · May be fixed by #4488

@AuroraTwinkle
Contributor

BUG REPORT

Describe the bug

The maxPendingAddRequestsPerThread configuration in bookkeeper.conf is inconsistent with the actual behavior, resulting in netty direct memory OOM.

To Reproduce

Steps to reproduce the behavior:

  1. Set `maxPendingAddRequestsPerThread=1000` in bookkeeper.conf.
  2. Simulate a RocksDB failure so that flush is blocked for a long time.
  3. addEntry requests queue up in the thread pool; each queue grows to twice the `maxPendingAddRequestsPerThread` limit.
  4. The doubled backlog of queued requests exhausts Netty direct memory, causing an OOM.

Expected behavior

If `maxPendingAddRequestsPerThread` is set to 1000, the number of pending requests per thread should never exceed 1000. Otherwise memory usage exceeds what the configuration implies, which can cause an OOM.

Reading the source code, I found that the root cause is the localTasks logic in SingleThreadExecutor. Before each execution round, the worker thread drains all tasks from the thread-pool queue into localTasks. At that point the queue looks empty, even though the tasks in localTasks have not yet been executed, so producers can enqueue up to the limit again. In the worst case, each SingleThreadExecutor therefore holds twice the configured number of pending tasks. I assume localTasks is used to avoid lock contention and improve performance, which is fine, but we should still consider preventing the number of queued tasks from doubling.
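The accounting gap described above can be reproduced with a toy model. This is not the actual SingleThreadExecutor code; the class and method names below are made up for illustration, and only the drain-then-refill sequence mirrors the behavior described in this issue:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model: a worker that drains its bounded queue into a local list
// before executing, as described for SingleThreadExecutor's localTasks.
public class DrainDoubling {

    // Worst-case pending tasks when the queue is bounded to maxPending
    // but the worker drains into a local list before running anything.
    static int totalPendingAfterDrain(int maxPending) {
        Queue<Runnable> queue = new ArrayDeque<>();
        List<Runnable> localTasks = new ArrayList<>();

        // 1. Producers fill the queue up to the configured limit.
        for (int i = 0; i < maxPending; i++) queue.add(() -> {});

        // 2. Worker drains everything into localTasks; the queue now
        //    *looks* empty, so the pending-count check passes again.
        while (!queue.isEmpty()) localTasks.add(queue.poll());

        // 3. While localTasks is still blocked (e.g. on a RocksDB flush),
        //    producers enqueue another maxPending tasks.
        for (int i = 0; i < maxPending; i++) queue.add(() -> {});

        // Total pending work is now double the configured limit.
        return queue.size() + localTasks.size();
    }

    public static void main(String[] args) {
        System.out.println(DrainDoubling.totalPendingAfterDrain(1000));
    }
}
```

With `maxPendingAddRequestsPerThread=1000`, this model ends up with 2000 pending add requests per thread, each holding Netty direct-memory buffers.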

Screenshots
(three screenshots attached to the original issue)


@AuroraTwinkle
Contributor Author

@merlimat Hi, please pay attention to this issue, thanks!

@nodece
Member

nodece commented Aug 23, 2024

It seems that we need to add an atomic integer to record the number of in-flight tasks.

@nodece
Member

nodece commented Aug 24, 2024

I submitted a PR to fix this issue: #4488
