Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use new autoupdate APIs in discovery service #51758

Merged
merged 9 commits into from
Feb 7, 2025

Conversation

hugoShaka
Copy link
Contributor

@hugoShaka hugoShaka commented Feb 2, 2025

This PR replaces every direct usage of the proxy's autoupdate channels from RFD 109 by agentAutoUpdateVersion() which will look at the RFD-184 autoupdate_agent_rollout resource first, then fallback to RFD-109 proxy channels.

Fixes an issue found during the autoupdate test plan: discovery service and discover UI don't use the version from autoupdate_agent_rollout, this causes version inconsistencies.

There are two gotchas in the PR:

  • version.Getter.GetVersion() returned versions with the leading "v" whileagentAutoUpdateVersion() doesn't have the leading v.
  • The discovery service might not be running in the Teleport proxy nor have access to autoupdate resources. In this case, we build a version client (the same way we do for the kube-agent-updater) and hit the proxy. When building the version client I uncovered an existing bug where the discovery service cannot get the version if there is no proxy in the cluster. I caught the case, added a warning and fellback to the current Teleport version.

Changelog: Scripts installing nodes no longer fail if they cannot discover the autoupdate version, they send a warning and use the version from their own Teleport binary.

Part of: RFD-184
Goal (internal): https://github.com/gravitational/cloud/issues/10289

@hugoShaka hugoShaka force-pushed the hugo/use-right-version-in-discover branch from d1cda36 to f88f765 Compare February 2, 2025 21:37
@hugoShaka hugoShaka changed the title Remove name parameter from proxy version getter Use new autoupdate APIs in discovery sevrice Feb 2, 2025
Comment on lines -60 to -64
// Name implements Getter
func (g ProxyVersionGetter) Name() string {
return g.name
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Narrator voice: Name was not, in fact, part of the version.Getter interface.

@hugoShaka hugoShaka marked this pull request as ready for review February 3, 2025 15:11
@github-actions github-actions bot requested review from creack and tigrato February 3, 2025 15:12
@creack creack changed the title Use new autoupdate APIs in discovery sevrice Use new autoupdate APIs in discovery service Feb 3, 2025
@hugoShaka hugoShaka force-pushed the hugo/use-right-version-in-discover branch from 309d3c5 to eb4b739 Compare February 3, 2025 16:44
@hugoShaka hugoShaka force-pushed the hugo/use-right-version-in-discover branch from eb4b739 to 8f5db63 Compare February 4, 2025 16:43
lib/srv/discovery/kube_integration_watcher.go Outdated Show resolved Hide resolved
lib/srv/discovery/kube_integration_watcher.go Outdated Show resolved Hide resolved
lib/srv/discovery/kube_integration_watcher.go Outdated Show resolved Hide resolved
lib/web/apiserver.go Outdated Show resolved Hide resolved
lib/web/integrations_awsoidc.go Outdated Show resolved Hide resolved
lib/srv/discovery/kube_integration_watcher.go Outdated Show resolved Hide resolved
lib/web/integrations_awsoidc.go Outdated Show resolved Hide resolved
if err != nil {
return "", trace.Wrap(err)
h.logger.ErrorContext(r.Context(), "Cannot read autoupdate target version", "error", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we were terminating the flow on error. Should we do the same? Or should we default to current's teleport.Version ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very good question.

With the new autoupdate system I think that we are adding more moving parts and increasing the probability of errors when getting the autoupdate version. I think we should always fallback to the local proxy version in case of error because:

  • it's a better UX to get the install script working instead of a random error "token might be expired", you have a working node/integration
  • by definition we still enable autoupdates, so potential version drifts will get corrected over time, the core feature of updating the nodes/agents is not affected

Based on this discussion I changed the join scripts to also fallback to their own version instead of failing.

@public-teleport-github-review-bot public-teleport-github-review-bot bot removed the request for review from creack February 6, 2025 17:23
@hugoShaka hugoShaka added this pull request to the merge queue Feb 6, 2025
Merged via the queue into master with commit f2ff06a Feb 7, 2025
41 checks passed
@hugoShaka hugoShaka deleted the hugo/use-right-version-in-discover branch February 7, 2025 00:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants