cluster with kubernetes backend constantly updates the pods metadata #463
Comments
@lwolf sorry but I'm missing something. It's normal that the keeper/proxy updates its state every few seconds, and with the kubernetes store its state is saved in the pod metadata.
Looks like I'm also missing something. I haven't looked into the implementation of the new backend versus the old ones, but I suppose the main difference is where the state is saved. I'm using
@lwolf the k8s api doesn't have a TTL to expire keys like etcd/consul do, so the fastest way I found is to save the keeper and proxy state in their pod metadata so it's automatically "expired" when the pod resource is removed. Another solution would be to expose an endpoint from the keeper/proxy to query the state directly from the process. It could be a future change (PRs are welcome).
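To make that mechanism concrete, here is a minimal sketch (not stolon's actual code) of a component recording its state in its own pod's annotations, so the state goes away together with the pod. The annotation key, the payload fields, and the POD_NAME/POD_NAMESPACE environment variables are illustrative assumptions; it also assumes client-go v0.18+ (context-aware API) and a downward-API-injected pod name/namespace.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	podName := os.Getenv("POD_NAME")        // injected via the downward API (assumption)
	namespace := os.Getenv("POD_NAMESPACE") // injected via the downward API (assumption)

	for {
		// Hypothetical status payload; the real keeper/proxy state is richer.
		status, _ := json.Marshal(map[string]string{
			"uid":        "keeper-0",
			"healthy":    "true",
			"updateTime": time.Now().UTC().Format(time.RFC3339),
		})
		patch := map[string]interface{}{
			"metadata": map[string]interface{}{
				"annotations": map[string]string{
					"example.io/component-status": string(status), // hypothetical key
				},
			},
		}
		data, _ := json.Marshal(patch)
		// Merge-patch the component's own pod with the fresh status.
		_, err := client.CoreV1().Pods(namespace).Patch(
			context.TODO(), podName, types.MergePatchType, data, metav1.PatchOptions{})
		if err != nil {
			fmt.Fprintln(os.Stderr, "patch failed:", err)
		}
		// Each patch bumps the pod's resourceVersion, which is what pod watchers see.
		time.Sleep(5 * time.Second)
	}
}
```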
I see, thanks for the explanation. I really like the idea of a kubernetes backend for stolon, but these state updates are stopping me from using it for now. Would be great if you could expand a little on what exactly should be done for the second approach, like who should gather the state from that new endpoint inside the keeper/proxy?
The keepers and proxies will listen on an address:port and expose an http endpoint that, when called, will provide their state. The sentinel will query them and use this data instead of the data written to etcd/consul or the pod metadata. I haven't done this because using only the store doesn't require a second communication path. I understand your issue with watching pod changes, but I'm not sure it's bad annotation usage, and the same happens when saving cluster data in the configmap annotation (and we can't do anything there). A solution would be to improve kubectl to ignore reporting annotation changes. BTW, using a dedicated etcd server instead of the k8s api will provide greater availability since it's not impacted by the k8s api servers (shared between all the k8s components, possibly down when updating k8s, etc.; see the architecture doc).
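As a rough sketch of that proposal (hypothetical, not something implemented in stolon at the time of this thread), each keeper/proxy could serve its state over HTTP and the sentinel would poll it instead of reading the store or pod metadata. The /status path, the port, and the struct fields are assumptions.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"time"
)

// componentState is a made-up shape for illustration only.
type componentState struct {
	UID        string    `json:"uid"`
	Healthy    bool      `json:"healthy"`
	UpdateTime time.Time `json:"updateTime"`
}

func main() {
	http.HandleFunc("/status", func(w http.ResponseWriter, r *http.Request) {
		// Report the component's current state on demand instead of writing it anywhere.
		state := componentState{UID: "keeper-0", Healthy: true, UpdateTime: time.Now().UTC()}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(state)
	})
	// The sentinel would periodically GET http://<keeper-pod-ip>:8086/status.
	log.Fatal(http.ListenAndServe(":8086", nil))
}
```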
I agree that having a separate etcd cluster is preferable for HA clusters, but I like the idea of using k8s as a backend for cases when the availability of stolon is not that critical. Another possible solution could be to use CustomResources for the state.
CRDs aren't really suited for this kind of thing; they are for defining custom resources. In stolon we don't have custom resources, just one big thing that must be atomic, called clusterdata, plus components that publish their state. Using a CRD for components that publish their state isn't really good and doesn't solve the problem of removing them when the pod exits (every proxy gets a different uid at every start). Just note that the configmap used to save clusterdata is also a hack: we don't use any configmap feature, we just save the clusterdata in a configmap annotation. We use a configmap only because it (with the endpoint) is one of the resources for which the k8s client already implements leader election. Another solution, to avoid both a "status" endpoint and saving component status in their pods, would be to use an additional resource shared by all the components, where they'd write their status (using a different annotation for every component) and the sentinel would periodically clean up old annotations. If someone wants to try this I'm open to PRs.
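A hedged sketch of that last idea: components write their status as per-component annotations on one shared resource (a ConfigMap here), and the sentinel periodically prunes entries whose embedded timestamp is too old. The annotation prefix, ConfigMap name, and staleness window are assumptions, and this is not stolon code. It also assumes client-go v0.18+.

```go
package main

import (
	"context"
	"encoding/json"
	"log"
	"strings"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// Hypothetical annotation prefix: each component writes "status.example.io/<uid>".
const prefix = "status.example.io/"

type componentStatus struct {
	UpdateTime time.Time `json:"updateTime"`
}

// pruneStale removes per-component status annotations that are older than maxAge
// (or unparsable), which is the cleanup the sentinel would run periodically.
func pruneStale(client kubernetes.Interface, ns, name string, maxAge time.Duration) error {
	cm, err := client.CoreV1().ConfigMaps(ns).Get(context.TODO(), name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	for key, val := range cm.Annotations {
		if !strings.HasPrefix(key, prefix) {
			continue
		}
		var st componentStatus
		if err := json.Unmarshal([]byte(val), &st); err != nil || time.Since(st.UpdateTime) > maxAge {
			delete(cm.Annotations, key) // stale or unparsable: drop it
		}
	}
	_, err = client.CoreV1().ConfigMaps(ns).Update(context.TODO(), cm, metav1.UpdateOptions{})
	return err
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	// "stolon-component-status" is an assumed shared resource name.
	if err := pruneStale(client, "default", "stolon-component-status", 30*time.Second); err != nil {
		log.Fatal(err)
	}
}
```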
Another concern is that some controllers are listening for pod events (e.g. ingress controllers), and the keeper/proxy updates cause some CPU (and logging) overhead.
We've observed the same thing. There are other side effects which go beyond the one described here.
@sgotti, K8s has the lease API https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#lease-v1-coordination-k8s-io . It seems it was designed to replace the configmap-based leader election. Perhaps it can be used for an alternative implementation?
@mindw This issue isn't related to the use of a configmap vs the lease api. We are using a configmap to store the clusterdata, and while we also use it for sentinel leader election, the lease api won't work since we also have to store the configdata and we'll need a configmap anyway. This issue is related to the fact that the keepers and proxies write their status to their own pod metadata, and this is reflected when one watches for pod changes. I personally think that this isn't an issue and should be fixed on the kubectl/kube api side by providing a way to filter update types. But if you want, you can implement something like the one I proposed at the end of this #463 (comment).
Environment
kubernetes 1.9
Stolon version
master-pg9.6
0.10.0-pg10
Expected behaviour you didn't see
pods do not update after initialisation
Unexpected behaviour you saw
pods constantly getting updated.
https://gist.github.com/lwolf/2981232c2ccaa87e3d15681bcc425fe0
Diff of the same pod after a few seconds shows that infoUID in metadata.annotations.stolon-status is constantly getting a new value, along with resourceVersion.
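For anyone who wants to reproduce the observation programmatically, the following hedged sketch watches the stolon pods and prints the resourceVersion and stolon-status annotation on every event; the namespace and label selector are assumptions about the example deployment, and it assumes client-go v0.18+.

```go
package main

import (
	"context"
	"fmt"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the local kubeconfig (~/.kube/config) for an out-of-cluster observer.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	w, err := client.CoreV1().Pods("default").Watch(context.TODO(), metav1.ListOptions{
		LabelSelector: "app=stolon-keeper", // assumed label; adjust to your deployment
	})
	if err != nil {
		log.Fatal(err)
	}
	defer w.Stop()

	for ev := range w.ResultChan() {
		pod, ok := ev.Object.(*corev1.Pod)
		if !ok {
			continue
		}
		// Each keeper/proxy state save shows up as a MODIFIED event where only
		// resourceVersion and the stolon-status annotation changed.
		fmt.Printf("%s %s rv=%s stolon-status=%s\n",
			ev.Type, pod.Name, pod.ResourceVersion, pod.Annotations["stolon-status"])
	}
}
```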
Steps to reproduce the problem
deploy the kubernetes example from this repository