
Proposal: Add reconciliation buffer to reduce load #857

Open
toonsevrin opened this issue Mar 14, 2020 · 30 comments
Labels
kind/design Categorizes issue or PR as related to design. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Milestone

Comments

@toonsevrin

Some of my controllers receive bursts of reconciliation requests. However, handling each request individually brings no improvement in controller quality over handling them in batches every now and then.

I propose that we optionally add a new field to the following controller Options:

type Options struct {
	// MaxConcurrentReconciles is the maximum number of concurrent Reconciles which can be run. Defaults to 1.
	MaxConcurrentReconciles int

	// Reconciler reconciles an object
	Reconciler reconcile.Reconciler

	// RateLimiter is used to limit how frequently requests may be queued.
	// Defaults to MaxOfRateLimiter which has both overall and per-item rate limiting.
	// The overall is a token bucket and the per-item is exponential.
	RateLimiter ratelimiter.RateLimiter
}

I'm definitely not an expert, but I think the new option could share the interface of the existing rate limiter, except that it would apply to all requests rather than only to retries?

@vincepri
Member

/priority awaiting-more-evidence

Can you clarify your use case a little bit more?

@k8s-ci-robot k8s-ci-robot added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Mar 26, 2020
@vincepri vincepri added this to the Next milestone Mar 26, 2020
@vincepri
Member

/kind design

@k8s-ci-robot k8s-ci-robot added the kind/design Categorizes issue or PR as related to design. label Mar 26, 2020
@toonsevrin
Author

/priority awaiting-more-evidence

Can you clarify your use case a little bit more?

This is generally useful when a CRD may be changed extremely often (e.g. multiple times per second) and you do not care about reconciliation happening for each of those changes. Instead, you are fine with it happening every now and then on the latest change.

@toonsevrin
Author

My personal use case is a controller which listens to all Secret, Namespace and ServiceAccount changes. The quality of the controller would not really change, however, if reconciliation ran only once every, say, 10 seconds, even if 100 events were watched in that period.

@vincepri
Member

Ah yes, that's what we initially understood when chatting at the community meeting. It seems like an interesting use case indeed.

cc @DirectXMan12

@toonsevrin
Author

I think so too! Another use case is where there is asymmetry: e.g. when changing one resource (for example a Deployment) results in the controller changing hundreds of other resources. Adding a buffer protects these controllers from being DoSed.

It becomes even more useful when you create a circular reconciliation loop by accident. Let's say controller A watches Secret changes and updates the relevant Deployments whenever a Secret is changed, while controller B updates the relevant Secrets whenever a Deployment is updated. Although this is horrible in general, a buffer on B would at least prevent it from completely blowing up :)

@DirectXMan12
Contributor

aah, that makes sense. I think we'd have to heavily document this to avoid footguns, but this seems like a reasonable use case.

@toonsevrin can you write up a short design in a teensy bit more detail? Mainly interested in "how do we teach this to avoid confusion with the backoff rate limiter".

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 19, 2020
@vincepri
Member

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 20, 2020
@vincepri vincepri removed the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Aug 5, 2020
@vincepri
Member

vincepri commented Aug 5, 2020

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Aug 5, 2020
@alvaroaleman
Member

FWIW, we've built something like this as a Reconciler wrapper downstream: https://github.com/openshift/ci-tools/blob/e847c7d352263b7415732d4309c041d3afb8ab71/pkg/controller/util/request_coalescer.go#L13

@negz
Contributor

negz commented Sep 23, 2021

Just came here with what appears to be another use case for this (after being surprised that the RateLimiter added per #631 only rate limits requeues, not queues). In my case there's a strong correlation between reconciles and API calls that may cost money, so we'd like to offer users an option to rate limit reconciles. It seems we can do this with something like @alvaroaleman's approach, but baked-in support would be nice.

@negz
Contributor

negz commented Sep 24, 2021

Er, just ended up back here re-reading today and realised that what I want isn't quite the same as what this issue is about, but it still seems like a 'middleware' Reconciler (and/or a custom event Source) could do the trick.

@CecileRobertMichon

Here is what we did in cluster-api-provider-azure to solve this: devigned/cluster-api-provider-azure@b6b38b0

It would be great to include something like it in controller-runtime so all projects can use it.

cc @devigned

@damdo
Member

damdo commented Dec 14, 2022

This would really be a useful addition for certain use cases, especially considering that various projects are already implementing it as a wrapper.

Would this be something we are still interested in seeing upstreamed here in controller-runtime?

@JoelSpeed
Contributor

What's the best way to progress this issue? Should we discuss this on a community call? Is the design template still used within this community? Should we just copy one of the existing implementations (I reviewed both; the CAPZ one looks preferable IMO) into a PR and see how we go?

@alvaroaleman
Member

alvaroaleman commented Dec 15, 2022

I meant to respond here, sorry. A couple of points to consider: first off, how much demand for this actually exists? And will we be able to find a solution that makes everyone who commented on this issue happy?

If this turns out to be a more niche problem where we can not easily find a solution that works well for everyone, I'd prefer not to add anything in c-r for it. Controller-Runtime is intentionally factored in a way where you can solve problems like this downstream (correct me if you see reasons why that wouldn't be possible here).

@JoelSpeed
Contributor

First off, how much demand for this actually exists?

In terms of the demand, I think we have a few folks who've certainly hit issues like this, so there's some want from the community at the minimum. I can think of a few cases where issues might be solved by something in this space:

  • Where you get a lot of events that you map onto a single object (e.g. a one-to-many relationship) and you don't need to reconcile the parent object many times in a short period
  • Where you are creating objects that you watch; we've seen examples where we "double create" because the second reconcile of the object happens so quickly that the API isn't yet returning the newly created object (even with an uncached client)
  • Where you are watching objects that have a high event count, perhaps Nodes or Pods, and you only need to react every so often (the cluster autoscaler feels like a case here: it reconciles periodically because of the overwhelming number of events it would otherwise receive; perhaps with coalescing it could be more efficient?)
  • Where you have an external system with a rate limit: if you know each reconcile costs x of your y quota per hour, you could use request coalescing to rate limit your reconciler

I appreciate some of these cases may be solved in other ways
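As a sketch of the last bullet above, a reconciler that spends paid API quota could gate itself on a token bucket. This is a stdlib-only stand-in, not anything that exists in controller-runtime; a real implementation would more likely use golang.org/x/time/rate and return a RequeueAfter result when throttled.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Bucket is a minimal token bucket: up to `capacity` tokens, refilled at
// `rate` tokens per second. Each reconcile that hits the paid API takes one.
type Bucket struct {
	mu       sync.Mutex
	tokens   float64
	capacity float64
	rate     float64 // tokens per second
	lastFill time.Time
}

func NewBucket(capacity, perSecond float64) *Bucket {
	return &Bucket{tokens: capacity, capacity: capacity, rate: perSecond, lastFill: time.Now()}
}

// Allow refills the bucket based on elapsed time, then tries to take a token.
func (b *Bucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens += now.Sub(b.lastFill).Seconds() * b.rate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.lastFill = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false // caller should requeue instead of reconciling now
}

func main() {
	b := NewBucket(2, 0.5) // burst of 2, then one reconcile every 2s
	for i := 0; i < 4; i++ {
		fmt.Printf("reconcile %d allowed: %v\n", i, b.Allow())
	}
}
```

Unlike debouncing, this caps the reconcile rate globally per bucket rather than per key, which matches the "shared external quota" case.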

If this turns out to be a more niche problem where we can not easily find a solution that works well for everyone, I'd prefer not to add anything in c-r for it. Controller-Runtime is intentionally factored in a way where you can solve problems like this downstream (correct me if you see reasons why that wouldn't be possible here).

The proposal of a middleware would mean that this wouldn't really affect the existing scope of the manager or reconciler interfaces; instead it would just provide a utility that folks could opt into if they decided they needed it. I think that fits the theme of people solving the issue downstream, while providing a common utility for a common problem. But it warrants further discussion about what we do and don't want to support within CR, so I think it would be good to discuss this at a community call in the new year.

@vincepri
Member

Where you are creating objects that you watch, we've seen examples where we "double create" because the second reconcile of the object happens so quickly the API isn't returning the newly created object (even with an uncached client)

This should probably be fixed by using a live client, what do you think? There is some opportunity for the controller-runtime built-in client to expose generally safer mechanics around GetOrCreate/Patch/etc.

@vincepri
Member

The use cases above make sense to me, we could explore some different options where:

  • We have a stepping-stone implementation of sorts where a coalescing function is marked as alpha (w/ feature gates of sorts)
  • We should be clear in documentation and surrounding code comments that the first implementation is purely alpha.
  • The first step could be a simple coalescing function, which specifies the time window to coalesce by GVK

thoughts?

@alvaroaleman
Member

Where you are creating objects that you watch, we've seen examples where we "double create" because the second reconcile of the object happens so quickly the API isn't returning the newly created object (even with an uncached client)

That is an unrelated issue and just waiting is not a great approach to solve it. A better one is to use deterministic names so that the second create will fail if the object already exists.

The first step could be a simple coalescing function, which specifies the time window to coalesce by GVK

Sounds fine to me. Can we outline the external API and the expected behavior here though, to make sure people can check whether it works for them?

@JoelSpeed
Contributor

This should probably be fixed by using a live client, what do you think? There is some opportunity for Controller Runtime built-in client to expose generally safer mechanics around GetOrCreate/Patch/etc

So yeah, that's what I thought. We put a patch in to "double check" the result just before the create, making 100% sure we use a live/uncached client, and we are still able to reproduce the bug. That's super weird and I'm pretty confused about it, but it's been reproduced several times even with the uncached client in place; each time, a super hot reconcile, sub-10ms apart, has been seen.

That is an unrelated issue and just waiting is not a great approach to solve it. A better one is to use deterministic names so that the second create will fail if the object already exists.

That's not always possible, and not possible in the particular case I've been working on recently. Think about a controller like the ReplicaSet controller: it maintains a number of Pods, and if it were observing this bug, it would create too many Pods, but it can't use deterministic names. Our use case is similar; we have no option to use deterministic names, and I bet there are others out there that have no choice either (the cluster-api project, for example, where machine creation can't use deterministic names).

@alvaroaleman
Member

That's not always possible, and not possible in the particular case I've been working on recently. Think about a controller like the replicaset controller, it maintains a number of pods, if it were observing this bug, it would create too many pods, but it can't use deterministic names.

That sounds like a good use case for expectations, just like the ReplicaSet controller uses them. The basic idea is to track your create/delete actions and only sync once your cache has observed all of them. This is something controller-runtime doesn't help you with today, but there is another issue for that.
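The expectations idea can be sketched like this (a heavily simplified stand-in for the upstream controller expectations, not the actual k8s.io/kubernetes implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// Expectations tracks creates a controller has issued but whose watch
// events its cache has not yet observed; reconciles skip syncing until
// the count drains to zero, preventing duplicate creates.
type Expectations struct {
	mu      sync.Mutex
	pending map[string]int // controller key -> outstanding creates
}

func NewExpectations() *Expectations {
	return &Expectations{pending: map[string]int{}}
}

// ExpectCreations is called just before the controller issues n creates.
func (e *Expectations) ExpectCreations(key string, n int) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.pending[key] += n
}

// CreationObserved is called from the watch handler when a create lands in the cache.
func (e *Expectations) CreationObserved(key string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.pending[key] > 0 {
		e.pending[key]--
	}
}

// Satisfied reports whether it is safe to sync (no creates still in flight).
func (e *Expectations) Satisfied(key string) bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.pending[key] == 0
}

func main() {
	e := NewExpectations()
	e.ExpectCreations("default/my-replicaset", 2)
	fmt.Println("safe to sync:", e.Satisfied("default/my-replicaset")) // false
	e.CreationObserved("default/my-replicaset")
	e.CreationObserved("default/my-replicaset")
	fmt.Println("safe to sync:", e.Satisfied("default/my-replicaset")) // true
}
```

A production version also needs a timeout so a lost watch event can't wedge the controller forever.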

@justinsb
Contributor

I encountered something similar. The challenge I see with the approaches that involve calling RequeueAfter is that, as far as I can tell, RequeueAfter does not combine results by key, so we still end up with the same number of reconciliations, just spread out over more time.

My reasoning:

When we call RequeueAfter, that results in a call to AddAfter on DelayingInterface, which lands at https://github.com/kubernetes/client-go/blob/1517ffb8d37c99e6a3a2842bcdee0aa271f0332b/util/workqueue/delaying_queue.go#L287 . That is a simple list, not a map by types.NamespacedName.

Maybe I am missing something though?

I'm wondering if we need a better Queue implementation, one that is able to coalesce identical requests in the queue and/or debounce. AFAICT there's no way to set Queue/MakeQueue though; maybe a good first step would be to expose that so that controllers could prove this is a win.

So maybe a +1 on #857 (comment) , and specifically a request to let controllers experiment with replacing the Queue.

@sbueringer
Member

cc @fabriziopandini @ykakarap @killianmuldoon
(potentially interesting for ClusterAPI)

@alvaroaleman
Member

I'm wondering if we need a better Queue implementation, one that is able to coalesce the same requests in the queue and/or debounce. AFAICT there's no way to set Queue/MakeQueue though - maybe a good first step would be to expose that so that controllers could prove that this is a win.

Adding an option to replace the queue seems ok to me.

What exactly do you mean by "coalesce the same requests in the queue"? It should already de-duplicate identical items; is that what you mean? Or are you referring to the AddAfter behavior? If so, IMHO a case can be made both for de-duplicating and not de-duplicating, so it is unlikely the current behavior will be changed.

Debouncing should already be possible using RequeueAfter (admittedly not super elegant) or am I missing something? I linked something that does that in #857 (comment)

@fabriziopandini
Member

I don't know if it's feasible, but it would be ideal to have something like a priority queue, with resync events having a lower priority and other events having a higher priority.
This could help when there are many objects of the same type and every resync period produces a storm of events being added to the queue.

@alvaroaleman
Member

@fabriziopandini agree about priority queue, we also have some use cases where that would be useful. However, that issue is IMHO orthogonal to this and has some added challenges like the fact that there is currently no upstream priority queue implementation. Would you mind opening a dedicated issue for a priority queue, please?

@shashankram

@alvaroaleman what kind of debouncing is supported? I see an issue where my controller watches an Object that it creates, and a change results in 2 Update events: one from the initial change to the watched Object, and a subsequent one from the controller updating the Object in reaction to the previous event. How can these be coalesced into a single event?
