
Scaling information #5447

Closed
MrAmbiG opened this issue Dec 10, 2021 · 4 comments
Labels: kind/documentation (Categorizes issue or PR as related to documentation.), language/ansible (Issue is related to an Ansible operator project), priority/backlog (Higher priority than priority/awaiting-more-evidence.), triage/support (Indicates an issue that is a support question.)
MrAmbiG commented Dec 10, 2021

What is the URL of the document?

Which section(s) is the issue in?

Scale

What needs fixing?

Additional context

The following details are missing:

  1. Does the operator auto-scale AWX?
  2. If the operator does not do auto-scaling (HPA or VPA), does it allow us to apply a separate hpa.yaml for AWX?
  3. If the operator does support auto-scaling, what are the parameters (CPU, memory, requests, I/O, etc.) and their thresholds (80%, 10K requests, etc.)?
  4. If auto-scaling is not built in and an external HPA configuration is not supported, are there any plans to introduce auto-scaling?
@MrAmbiG MrAmbiG added the kind/documentation Categorizes issue or PR as related to documentation. label Dec 10, 2021
camilamacedo86 (Contributor):

c/c @fgiloux

fgiloux (Contributor) commented Dec 11, 2021

Hi @MrAmbiG @camilamacedo86

Thanks Camila for adding me to this issue. I am not very knowledgeable about the Ansible Operator, but I don't think that it behaves differently from a Go operator in this respect.

  1. Does the operator do auto scaling of the awx ?

The operator does not auto-scale on its own.

  2. If the operator does not do auto-scaling (HPA or VPA), does it allow us to apply a separate hpa.yaml for AWX?

The operator does nothing to prevent you from using a HorizontalPodAutoscaler (HPA) or a VerticalPodAutoscaler (VPA). Depending on what you want to auto-scale, the operator itself or the operands (the memcached instances in the tutorial), the approach differs.
For the operator itself: it should run as a singleton, so there is no point in HPA. You can freely use VPA; VPA modifies the pod definitions rather than the deployment, so it is transparent to OLM. If you are using OLM to distribute the operator there is one gotcha: VPA is optional on Kubernetes, and if the cluster where the operator is to be deployed does not have the VPA CRD installed, the complete InstallPlan will fail and the operator (or the new version of the operator) won't get installed. I have created an enhancement proposal to work around this by allowing optional manifests, but it is not implemented yet.
For the operands, you can use VPA without issue. With HPA I have not experimented. You should avoid reconciling spec.replicas; this should be straightforward in the operator logic. Ideally the reconciliation loop would not run at all when this field is modified, but I am not sure how that can be avoided. The difference with VPA is that it modifies only the pods, not the deployment.
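To illustrate applying a separate HPA to an operand, a standalone manifest can target the operand's Deployment directly. This is only a sketch: the names awx-hpa and awx-demo and the 80% CPU threshold are hypothetical examples, not values from the operator's documentation, and it assumes the operator does not reconcile spec.replicas on that Deployment (older clusters may need autoscaling/v2beta2 instead of autoscaling/v2):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: awx-hpa                 # hypothetical name for this sketch
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: awx-demo              # hypothetical operand Deployment managed by the operator
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # example threshold only
```

Applied with kubectl apply -f hpa.yaml, this would scale the operand between 1 and 5 replicas, provided the operator leaves spec.replicas alone.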

  3. If the operator does support auto-scaling, what are the parameters (CPU, memory, requests, I/O, etc.) and their thresholds (80%, 10K requests, etc.)?

N/A; see the VPA and HPA documentation.

  4. If auto-scaling is not built in and an external HPA configuration is not supported, are there any plans to introduce auto-scaling?

N/A.

I hope this helps.

camilamacedo86 (Contributor) commented Dec 13, 2021

Hi @MrAmbiG,

Just to supplement:

You can use HPA (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) or VPA for the operands (what is managed by your operator), but not for your operator/manager pod itself. Why? That would mean more than one process trying to run the same reconciliations.

Running more than one manager replica is not supported by controller-runtime, which is the library responsible for the manager/controller logic. More info: kubernetes-sigs/controller-runtime#1456

Also, I'd like to highlight the comment: kubernetes-sigs/controller-runtime#1456 (comment)

We haven't really seen a case in the wild where a controller needed parallelism across multiple managers to reach usable performance. Sort of by definition the actions a controller is taking should be very fast. And we already support local concurrency (multiple worker goroutines within a single manager) via simple configuration options. Anything making so many object changes that no one node could keep up, it probably means you shouldn't be writing an operator in the first place. Maybe some day we'll find an actual use case but I think for now this is very much a "no".

To know more about this option see:

You can pass the MaxConcurrentReconciles option when setting up the controller: https://github.com/operator-framework/operator-sdk/blob/master/testdata/go/v3/memcached-operator/main.go#L68-L75

By default, it is 1: https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/controller/controller.go#L37-L38
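The local-concurrency option above can be sketched as follows. This is a hedged fragment, not a complete program: it assumes the memcached tutorial's MemcachedReconciler and cachev1alpha1 types from the linked operator-sdk testdata, and the value 3 is an arbitrary example. The WithOptions/controller.Options pattern is controller-runtime's standard way to set MaxConcurrentReconciles:

```go
// Fragment: part of the tutorial reconciler's setup code, assuming the
// MemcachedReconciler and cachev1alpha1 types from the linked testdata.
import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/controller"
)

// SetupWithManager registers the controller with the manager and raises
// the worker count from the default of 1 to 3 concurrent reconciles.
func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&cachev1alpha1.Memcached{}).
		WithOptions(controller.Options{MaxConcurrentReconciles: 3}).
		Complete(r)
}
```

Note this is concurrency of worker goroutines within a single manager pod, which is exactly the supported alternative to running multiple manager replicas.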

PS.: We have an open issue to improve the controller-runtime docs: kubernetes-sigs/controller-runtime#1416

@asmacdo asmacdo added language/ansible Issue is related to an Ansible operator project triage/support Indicates an issue that is a support question. labels Dec 13, 2021
@asmacdo asmacdo self-assigned this Dec 13, 2021
@theishshah theishshah added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Jan 10, 2022
@theishshah theishshah added this to the Backlog milestone Jan 10, 2022
asmacdo (Member) commented Jan 17, 2022

Closing for the operator-sdk, and I'll reopen as an awx-operator issue.

@asmacdo asmacdo closed this as completed Jan 17, 2022