Don't panic in case of failures but rather retry and report health status #1266

Open
1 of 2 tasks
Tracked by #1800
tillrohrmann opened this issue Mar 6, 2024 · 0 comments

Since a node can run multiple Restate components, panicking on an error in one component is no longer a good idea: a single failing component drags every other component on the node down with it. Instead, the default behavior should be to retry the failed operation and to report the retrying as part of that component's health status. Based on this health status, the cluster controller can then make global decisions about which components to shut down or migrate.
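
To make the proposal concrete, here is a minimal sketch of the retry-and-report pattern using only the Rust standard library. It is not Restate's actual implementation: the `Health` enum, the `HealthHandle` alias, and the `set_health` and `run_with_retry` functions are hypothetical names invented for illustration, and the attempt limit and exponential backoff are assumptions that the issue does not specify.

```rust
use std::fmt::Display;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical health states a component could report.
#[derive(Debug, Clone, PartialEq)]
enum Health {
    Healthy,
    Retrying { attempt: u32, last_error: String },
    Failed { last_error: String },
}

/// Hypothetical shared handle that a cluster controller could read.
type HealthHandle = Arc<Mutex<Health>>;

/// Store a new status. A poisoned lock is recovered rather than unwrapped,
/// so that reporting health never panics itself.
fn set_health(handle: &HealthHandle, status: Health) {
    let mut guard = handle.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    *guard = status;
}

/// Run `op`, retrying on error instead of panicking, and record each
/// state change in `health`. The backoff doubles after every failed attempt.
fn run_with_retry<T, E, F>(
    health: &HealthHandle,
    max_attempts: u32,
    initial_backoff: Duration,
    mut op: F,
) -> Result<T, E>
where
    E: Display,
    F: FnMut() -> Result<T, E>,
{
    let mut backoff = initial_backoff;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => {
                set_health(health, Health::Healthy);
                return Ok(value);
            }
            // Attempts remain: report that we are retrying, wait, try again.
            Err(err) if attempt < max_attempts => {
                set_health(
                    health,
                    Health::Retrying { attempt, last_error: err.to_string() },
                );
                thread::sleep(backoff);
                backoff = backoff.saturating_mul(2);
                attempt += 1;
            }
            // Out of attempts: report failure and hand the error to the
            // caller, leaving the other components on the node untouched.
            Err(err) => {
                set_health(health, Health::Failed { last_error: err.to_string() });
                return Err(err);
            }
        }
    }
}
```

In this sketch a component would wrap each fallible operation in `run_with_retry` and share its `HealthHandle` with whatever reports to the cluster controller. A controller that sees `Retrying` for too long, or `Failed`, could then decide to shut down or migrate that component alone instead of losing the whole node to a panic.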

Tasks

  1. muhamadazmy
  2. muhamadazmy