version-checker seemingly leaks memory and gets oom-killed #76
Comments
We are seeing a similar behaviour while running Version Checker. Would be interested to know if there are recommended values for the limits?
Also seeing something similar with Version Checker getting OOM killed fairly frequently.
Hey @Trede1983 @trastle @roobre, sorry it's taken so long to get back to you on this issue.

There has been some work on version-checker since these issues were raised, attempting to reduce its memory footprint. Problems like this are extremely challenging to debug and replicate, so it would be amazing to know how many nodes/pods you had in the cluster at the time of this issue, along with the memory/CPU limits/requests you had/have set. I appreciate that this may be some time ago and that you may no longer be using version-checker, but this information could really help us understand the memory footprint in larger installations.

In terms of tuning and changes, the main one that comes to mind is #160, along with the already mentioned #69.
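For anyone comparing notes on the requests/limits asked about above, here is a minimal sketch of how they could be set through the Helm chart. This assumes the chart exposes a conventional `resources` block; the numbers are placeholders for illustration, not recommended values.

```yaml
# Illustrative Helm values for version-checker (placeholder numbers, not recommended defaults).
# Assumes the chart passes a standard `resources` block through to the container spec.
resources:
  requests:
    cpu: 50m
    memory: 128Mi
  limits:
    cpu: 200m
    memory: 256Mi
```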
Hello @davidcollom, I'm also encountering this issue. My test cluster is pretty small:
Version checker is the latest (0.7.0) and installed using helm with the following values:
If I set
I am running `version-checker` on a single node, quite small cluster with ~60 pods. So far it is working nicely, but I do not understand the memory behavior it has. I'm basically running the sample deployment file, plus the `--test-all-containers` flag and some CPU limits:

`kubectl get pod -o yaml`
Over time, I see that `version-checker` approaches the memory limit and then stays near ~99% for a while. After some time, the kernel kills the container due to OOM and k8s restarts the pod. However, I do not see anything alarming in the logs, other than some failures and expected permission errors.
This doesn't seem to have any functional impact, but does fire some alerts and doesn't look good on my dashboards :)
Is this behavior intended, and/or is there any way to prevent it?
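If the immediate goal is just to quiet the alerts while this is investigated, one option is to alert only on sustained near-limit usage rather than instantaneous spikes. A hypothetical Prometheus rule along those lines is sketched below; the metric names assume cAdvisor and kube-state-metrics are being scraped, and the threshold and duration are arbitrary placeholders.

```yaml
# Hypothetical Prometheus alerting rule: fire only when version-checker sits
# above 95% of its memory limit for 15 minutes straight. Metric names assume
# cAdvisor and kube-state-metrics; threshold and duration are placeholders.
groups:
  - name: version-checker-memory
    rules:
      - alert: VersionCheckerNearMemoryLimit
        expr: |
          max by (namespace, pod, container) (
            container_memory_working_set_bytes{container="version-checker"}
          )
            /
          max by (namespace, pod, container) (
            kube_pod_container_resource_limits{container="version-checker", resource="memory"}
          )
            > 0.95
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "version-checker is using more than 95% of its memory limit"
```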