I started investigating this because my go-fuzz process was OOM-killed within 30 minutes two times in a row.
After spending a while puzzling over the heap profile results (before I found the MemProfileRate=0 assignment), I acquired a reasonable profile and a few logs of what happened when the process started using lots of memory.
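For context: runtime.MemProfileRate = 0 disables heap-profile sampling entirely, which is why the initial profiles were useless. A minimal, standalone sketch of the general mechanism, re-enabling sampling before the allocations of interest and then dumping a profile (this is illustrative, not the exact change I made; the helper name and output path are made up), looks something like this:

```go
// Standalone sketch: restore the runtime's default heap sampling rate early,
// then write a heap profile once memory use has grown.
package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

func init() {
	// 0 disables heap sampling; 512 KiB per sample is the runtime default.
	runtime.MemProfileRate = 512 * 1024
}

func dumpHeapProfile(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	runtime.GC() // bring the profile up to date before writing it
	return pprof.WriteHeapProfile(f)
}

func main() {
	// ... run the workload that grows the heap, then:
	if err := dumpHeapProfile("heap.pprof"); err != nil {
		log.Fatal(err)
	}
}
```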
The process in question does crash a lot (a restart rate of more than 1 in 50 executions), which is obviously a big contributor to the problem, but I believe that go-fuzz should continue without using an arbitrary amount of memory even in that situation.
Here's a screenshot of the heap profile from one such run (unfortunately I lost the profile from that run), where over 2GB of memory is kept around in Worker.crasherQueue:
Although code inspection pointed towards crasherQueue as a possible culprit, I wasn't entirely sure that's what was happening until I reproduced the issue with a log statement added that showed the current size of the queue (including its associated data) whenever the queue slice was grown.
The final line that it printed before I dumped the heap profile was:
crasherQueue 0xc0000ca380 len 37171; space 465993686 (data 430941433; error 26019700; suppression 9032553)
That 466MB was 65% of the total current heap size of 713MB. In previous runs, I observed the total alloc size to rise to more than 8GB, although I wasn't able to obtain a heap profile at that time.
This problem does not always happen! It seems to depend very much on the current workload. It might be a starvation problem, because only one of the worker queues grows in this way.
Here's the whole log printed by that run: https://gist.github.com/rogpeppe/ad97d2c83834c24b0777a4009d71d120
The crasherQueue log lines were produced by a small patch to the Worker.noteCrasher method.
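That diff isn't reproduced here, but the idea is to log the queue length and the total bytes held by queued crashers whenever the slice's backing array has to grow. A rough sketch of that idea (stand-in Worker and NewCrasherArgs types that only mirror the shape of go-fuzz's worker state; this is not the actual patch) is:

```go
// Sketch only, not the actual patch: stand-in types approximating go-fuzz's
// worker state, plus logging that fires whenever crasherQueue grows.
package main

import "log"

type NewCrasherArgs struct {
	Data        []byte
	Error       []byte
	Suppression []byte
	Hanging     bool
}

type Worker struct {
	crasherQueue []NewCrasherArgs
}

func (w *Worker) noteCrasher(data, output, suppression []byte, hanged bool) {
	oldCap := cap(w.crasherQueue)
	w.crasherQueue = append(w.crasherQueue, NewCrasherArgs{
		Data:        data,
		Error:       output,
		Suppression: suppression,
		Hanging:     hanged,
	})
	if cap(w.crasherQueue) == oldCap {
		return // the slice didn't grow its backing array, so stay quiet
	}
	// The queue grew: report its length and how many bytes its entries hold.
	var dataN, errN, suppN int
	for _, c := range w.crasherQueue {
		dataN += len(c.Data)
		errN += len(c.Error)
		suppN += len(c.Suppression)
	}
	log.Printf("crasherQueue %p len %d; space %d (data %d; error %d; suppression %d)",
		w, len(w.crasherQueue), dataN+errN+suppN, dataN, errN, suppN)
}

func main() {
	w := new(Worker)
	for i := 0; i < 1000; i++ {
		w.noteCrasher([]byte("fuzz input"), []byte("panic: boom"), []byte("boom"), false)
	}
}
```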