feature: add Helm chart for jobset #785
/ok-to-test
Minor edits needed, but this is looking good. I will test it later today.
Force-pushed from 3a3e55f to 51064fa.
/hold
The good news is that the Helm chart installs without issue. The bad news is that I can't actually run any JobSet with this Helm chart. I get panics for some of the examples, and others complain about validation. Can you confirm on your end that you can run some of the examples?
Thank you for doing this @ChenYi015! I left my initial comments.
Should we also introduce this script to JobSet to sync manifests between Helm and Kustomize?
Force-pushed from 51064fa to 5610e6c.
@kannon92 I have updated the PR, and it now works fine: the example JobSet is reconciled properly.
Signed-off-by: Yi Chen <[email protected]>
Signed-off-by: Yi Chen <[email protected]>
I will raise another issue to track it and implement it later.
For the initial version of the jobset Helm chart, I think this PR is ready to be merged. The values are simplified to keep only a minimal set of parameters. In the future, we will extend the configuration if users request it. Syncing between Helm and Kustomize, the Helm chart CI workflows, and the CRD upgrade approach can all be implemented later.
Thank you for this effort @ChenYi015!
I left a few small comments. We should remove values from the manifests that we don't use currently. For example:
{{ include "jobset.controller.serviceAccount.name" . }}
I think that after that we should be ready to merge.
leaderElection:
  # -- Whether to enable leader election for jobset controller.
  enable: true
Can we please move the configurations that can be set with the Config API to the managerConfig section in the Helm chart values? This would be similar to Kueue: https://github.com/kubernetes-sigs/kueue/blob/main/charts/kueue/values.yaml#L70-L72.
I believe that makes it clearer for users that these parameters are part of the manager config.
WDYT @ChenYi015 @tenzen-y @kannon92?
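A rough sketch of what this suggestion could look like in the chart's values.yaml, modelled loosely on the Kueue chart linked above. The `managerConfig` and `controllerManagerConfigYaml` key names, the `apiVersion`/`kind` header, and the `leaderElect` field are assumptions borrowed from Kueue's layout; this PR does not define them.

```yaml
# Hypothetical values.yaml layout; key names are assumptions, not the chart's actual schema.
managerConfig:
  # Embedded manager configuration, rendered by the chart into a ConfigMap (assumed).
  controllerManagerConfigYaml: |-
    apiVersion: config.jobset.x-k8s.io/v1alpha1  # assumed group/version
    kind: Configuration
    leaderElection:
      leaderElect: true  # would replace the top-level leaderElection.enable value
```

Grouping these settings under one key would make it visible at a glance which values end up in the manager config rather than in Kubernetes resource templates.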
From my experience, I think we do not have to expose the Config API to end users, because it would make the chart more complex to configure. Besides, if one tries, for example, to configure the webhook service name and secret name through the Config API, it will not work, since the actual names of the webhook service and secret are templated by Helm.
jobset/api/config/v1alpha1/configuration_types.go
Lines 105 to 118 in 943da8b
type InternalCertManagement struct {
	// Enable controls whether to enable internal cert management or not.
	// Defaults to true. If you want to use a third-party management, e.g. cert-manager,
	// set it to false. See the user guide for more information.
	Enable *bool `json:"enable,omitempty"`
	// WebhookServiceName is the name of the Service used as part of the DNSName.
	// Defaults to jobset-webhook-service.
	WebhookServiceName *string `json:"webhookServiceName,omitempty"`
	// WebhookSecretName is the name of the Secret used to store CA and server certs.
	// Defaults to jobset-webhook-server-cert.
	WebhookSecretName *string `json:"webhookSecretName,omitempty"`
}
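In YAML form, the struct above would correspond to a stanza roughly like the following. This is only a sketch: the field names and default values come from the JSON tags and comments in the struct, while the enclosing `internalCertManagement` key and the `apiVersion`/`kind` header are assumptions about the surrounding Configuration type.

```yaml
# Sketch of a manager Configuration file; header and parent key are assumed.
apiVersion: config.jobset.x-k8s.io/v1alpha1
kind: Configuration
internalCertManagement:
  enable: true                                   # documented default
  webhookServiceName: jobset-webhook-service     # documented default
  webhookSecretName: jobset-webhook-server-cert  # documented default
```

If the chart templates the Service and Secret names itself, any names set here would have to match the rendered resource names, which is the mismatch described in the comment above.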
if one tries to configure the webhook service name and secret name by leveraging Config API, it will not work since the actual name of webhook service and secret is templated by Helm.
I guess that in that case we should duplicate the information for the Config and for the actual k8s resources, like here: https://github.com/kubernetes-sigs/kueue/blob/main/charts/kueue/values.yaml#L148-L157
I just feel that if we don't have the Config YAML in the chart values, it would be hard for users to understand that some values are going to be inserted into the manager config.
Maybe you know of a better solution? @astefanutti @tenzen-y @dongjiang1989
Signed-off-by: Yi Chen <[email protected]>
certManager:
  # -- Whether to use cert-manager to generate certificates for the jobset webhook.
  enable: false

  # -- The reference to the issuer.
  # If empty, self-signed issuer will be created and used.
  issuerRef: {}
  # name: selfsigned
  # kind: ClusterIssuer
For cert-manager, these params are kept to allow users to enable/disable cert-manager and use a custom issuer.
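As an illustration of the custom-issuer case, a user-supplied values override might look like the following. The issuer name `my-issuer` is hypothetical, and the sketch assumes that `certManager` sits at the top level of the chart values, which the excerpt above does not confirm.

```yaml
# Hypothetical override file, e.g. passed with `helm install -f`.
certManager:
  enable: true
  issuerRef:
    name: my-issuer      # hypothetical issuer name
    kind: ClusterIssuer
```

Leaving `issuerRef` empty would, per the chart comment, fall back to a self-signed issuer created by the chart.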
Thank you for this great contribution @ChenYi015!
I think we can address the remaining comments in follow-up PRs.
/lgtm
/assign @kannon92 @ahg-g @tenzen-y
@andreyvelich: changing LGTM is restricted to collaborators.
Signed-off-by: Yi Chen <[email protected]>
Thanks a lot to @ChenYi015, and to @andreyvelich for the thorough review. I hope we continue this collaboration and that you folks become maintainers on this repo as well, if you like :) I will leave it to @kannon92 to approve, since he was the primary maintainer looking at this PR.
prometheus:
  # -- Whether to enable Prometheus metrics exporting.
  enable: true
Can you disable this by default?
Kustomize doesn't require this.
We can do this as a follow up.
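Until the default changes, the exporter could presumably be turned off at install time with an override along these lines. This is a sketch only: it assumes the `prometheus` block sits at the top level of the values file, as the excerpt suggests but does not confirm.

```yaml
# Hypothetical override file; disables Prometheus metrics exporting.
prometheus:
  enable: false
```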
/lgtm
I tested this and, other than Prometheus being required by default, it seemed to work fine! We can address Prometheus as a follow-up, but I was able to test by disabling it, so I think this is good to go.
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: ChenYi015, kannon92
/hold cancel
@ChenYi015 @andreyvelich
I am going to raise another PR today to disable Prometheus metrics exporting by default in the Helm chart. Besides, I think we need to publish the Helm chart so users can install it like this:
helm repo add jobset https://kubernetes-sigs.github.io/jobset
helm install jobset jobset/jobset \
  --namespace jobset-system \
  --create-namespace
@ChenYi015 How do we want to publish them? Similar to KServe and Spark Operator?
cc @ahg-g I'm happy to wait until we have a site for the Helm chart, but I'm not sure what your customers want.
@tenzen-y do we publish the charts in Kueue anywhere?
Sorry for the late response. I think there is another way to publish the charts, see #790 (comment).
What type of PR is this?
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes:
Close #726
Special notes for your reviewer:
As discussed in kubeflow/trainer#2435 (comment), this PR is based on #760.
One can install the chart as follows:
For Helm chart development, one can generate the Helm chart README.md with make helm-docs and run the Helm unit tests with make helm-unittest.
Does this PR introduce a user-facing change?