Description
If the API server is not available, we log an error but still perform the calls (like mkdir) without the added namespace info.
Impact
It was really unclear to me why I wasn't seeing directories created with the ns name.
Environment and steps to reproduce
Set-up: Agent deployed to Kubernetes as described in the install docs
Task: Create the example pod from the install docs
Action(s): kubectl exec -ti <pod> -- bash and run mkdir /tmp/a in the pod
Error: If the agent can't contact the API server, the command just creates directory a instead of a-ns-something.
Expected behavior
I'd like to see a clearer message explaining what will happen when the API server can't be reached, e.g. "API server down, will proceed with XXX".
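To illustrate, here is a minimal sketch of what I have in mind (the handler and resolver names are hypothetical, not the agent's actual code): when the namespace lookup fails, the agent would emit an explicit warning saying exactly what it will fall back to, instead of only a generic error.

```go
// Hypothetical sketch of the fallback path; names do not match the real agent code.
package main

import (
	"fmt"
	"log"
	"os"
)

// resolveNamespace is assumed to ask the API server for the pod's namespace.
// Here it is a stand-in that always fails, to simulate an unreachable API server.
func resolveNamespace(podID string) (string, error) {
	return "", fmt.Errorf("API server unreachable")
}

// handleMkdir decorates the directory name with the namespace when available.
func handleMkdir(podID, path string) error {
	ns, err := resolveNamespace(podID)
	if err != nil {
		// Proposed behaviour: say explicitly what will happen instead of a bare error.
		log.Printf("warning: API server unreachable (%v); creating %q without the namespace suffix", err, path)
		return os.Mkdir(path, 0o755)
	}
	return os.Mkdir(path+"-"+ns, 0o755)
}

func main() {
	if err := handleMkdir("example-pod", "/tmp/a"); err != nil {
		log.Fatal(err)
	}
}
```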
This seems especially important to handle correctly when a pod is restarted (or killed, if the kubelet restarts it). My gut feeling is that if we kubectl delete pod <pod>, the kubelet will re-create it (if it is part of a deployment) even when the API server is down. At least that happens on my setup, but there I think the kubelet can still contact the API server; only the agent is blocked by the firewall, so we should check against a real API server outage. And IIUC the seccomp agent will query the API server in this restart case.
Assuming that happens, the following seems important to me:
Kubernetes is generally made to keep working when the API server is down: the kubelet should continue to work fine, restart pods already scheduled, etc. That is a big guarantee Kubernetes gives, and we shouldn't break it when using the seccomp agent (in the future; it is OK at this stage :))
Additional information
I happened to hit this setup because I have a local firewall rule that blocks this traffic and that NetworkManager constantly re-applies. It was nice to narrow down, though, as I now have a better setup locally :)
In the current code, the call to the API server happens for each new seccomp-fd but not for each syscall. So I think that even if the firewall is "fixed", currently running pods would still show the wrong behaviour.
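Roughly, the flow as I understand it is the following (hypothetical names, not the agent's real code): pod metadata is resolved once when a new seccomp fd is registered, and every later syscall on that fd reuses whatever was resolved at that point, so a failed lookup sticks for the lifetime of the container.

```go
// Hypothetical sketch of per-seccomp-fd metadata resolution; not the agent's real API.
package main

import (
	"fmt"
	"log"
)

type podMeta struct {
	Namespace string // empty if the API server could not be reached at registration time
}

// registerSeccompFd is called once per new seccomp notification fd (i.e. per container).
// The API server is queried here, not on every syscall.
func registerSeccompFd(fd int, lookup func() (string, error)) podMeta {
	ns, err := lookup()
	if err != nil {
		log.Printf("fd %d: could not resolve namespace: %v", fd, err)
		return podMeta{} // this empty value is reused for every later syscall on this fd
	}
	return podMeta{Namespace: ns}
}

// handleSyscall only uses the metadata captured at registration time.
func handleSyscall(meta podMeta, dir string) string {
	if meta.Namespace == "" {
		return dir
	}
	return fmt.Sprintf("%s-%s", dir, meta.Namespace)
}

func main() {
	meta := registerSeccompFd(3, func() (string, error) {
		return "", fmt.Errorf("blocked by firewall")
	})
	// Even after the firewall is fixed, existing fds keep the stale (empty) metadata.
	fmt.Println(handleSyscall(meta, "/tmp/a")) // prints /tmp/a, not /tmp/a-<ns>
}
```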
But with my opa branch, I am planning to make new calls to the API server to get custom resources, similar to Gatekeeper's constraints. That would work as a cache, though, so API server calls would not happen on the syscall critical path, and it would not matter if the API server is down for a little while.
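As a rough illustration of the caching idea (hand-rolled here with hypothetical names; the real implementation would more likely use client-go informers the way Gatekeeper does): the API server is polled in the background and syscall handling only ever reads the last successfully fetched value, so a temporary outage just means slightly stale data.

```go
// Hypothetical sketch of a background-refreshed cache; not the planned implementation.
package main

import (
	"fmt"
	"log"
	"sync"
	"time"
)

type policyCache struct {
	mu      sync.RWMutex
	current string // last successfully fetched value
}

// run refreshes the cache in the background; failures keep the previous value.
func (c *policyCache) run(fetch func() (string, error), interval time.Duration) {
	for {
		v, err := fetch()
		if err != nil {
			log.Printf("cache refresh failed, keeping stale value: %v", err)
		} else {
			c.mu.Lock()
			c.current = v
			c.mu.Unlock()
		}
		time.Sleep(interval)
	}
}

// get is what the syscall critical path would call: no API server round-trip here.
func (c *policyCache) get() string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.current
}

func main() {
	c := &policyCache{}
	go c.run(func() (string, error) { return "policy-v1", nil }, 30*time.Second)
	time.Sleep(100 * time.Millisecond)
	fmt.Println(c.get())
}
```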