Avoid flake unit tests #164
I added some debug logging and discovered that, during the cleanup phase, the controller continues to reconcile the resource.
I think we should look for a better way to clean up test resources. Is there a way to cleanly remove MetalLB from a cluster using only API resources?
	return err
}
// Wait for the deletion queue to be empty, as the delete method is asynchronous in k8s APIs.
err = retry.Do(
This is normally done with Gomega's `Eventually`.
You're right, but this function does not use Ginkgo assertions yet, so I'm going to return the error without retries and put the `Eventually` on the caller side. WDYT?
Works for me
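For anyone reading this thread later, the split agreed on above can be sketched roughly as follows. This is a stdlib-only illustration, not the code in this PR: `eventually` is a hand-rolled stand-in for Gomega's `Eventually`, and `fakeCluster`, `checkDeleted` and `waitForCleanup` are hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// eventually polls check every interval until it returns nil or the timeout
// elapses. It is a simplified stand-in for Gomega's Eventually, not its
// actual implementation.
func eventually(check func() error, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("condition not met within %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

// errStillPresent is returned while the deleted object is still visible.
var errStillPresent = errors.New("resource still present")

// fakeCluster is a hypothetical stand-in for the API server: a deleted object
// stays visible for a few more reads, mimicking asynchronous deletion.
type fakeCluster struct {
	readsUntilGone int
}

// checkDeleted is the single-shot helper: it reports the current state and
// returns an error without retrying.
func (c *fakeCluster) checkDeleted() error {
	if c.readsUntilGone > 0 {
		c.readsUntilGone--
		return errStillPresent
	}
	return nil
}

// waitForCleanup is the caller side: it owns the retry policy and wraps the
// single-shot helper in the polling loop.
func waitForCleanup(c *fakeCluster) error {
	return eventually(c.checkDeleted, time.Second, time.Millisecond)
}

func main() {
	c := &fakeCluster{readsUntilGone: 3}
	fmt.Println("first check:", c.checkDeleted())
	fmt.Println("after polling:", waitForCleanup(c))
}
```

Keeping the helper single-shot means the timeout and polling interval live in one place, on the caller side. In a real Ginkgo test the caller would presumably use something like `Eventually(checkDeleted).Should(Succeed())` instead of the hand-written loop.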
It would make sense for the controller to keep reconciling if we had not removed the MetalLB resource.
As far as I can tell, the envtest package does not include built-in controllers: https://book.kubebuilder.io/reference/envtest.html#testing-considerations So objects need to be deleted manually, unless we implement some workaround like this. Anyway, I need to dig deeper to understand the test flakiness, so I'm going to mark this PR as a draft.
Digging deeper, I discovered that the MetalLB and ConfigMap reconcilers continue working during resource cleanup, and there is no easy way to synchronize test code with production code short of introducing a sort of "reconciler state" and exposing it to the tests. So the safest way to avoid flakes is to wrap assertions in `Eventually()`.
Setting environment variables inside individual tests makes the tests order dependent, since SPEAKER_IMAGE, CONTROLLER_IMAGE, FRR_IMAGE and KUBE_RBAC_PROXY_IMAGE are mandatory for the controller to work without errors. Avoid setting the WATCH_NAMESPACE environment variable, as it's not used during tests. Signed-off-by: Andrea Panattoni <[email protected]>
MetalLBReconciler and ConfigMapReconciler continue reconciling even during the AfterEach phase of tests. Since test code can't know whether a reconciler is still working, there is no way to safely and completely delete all resources during cleanup. So it's better to wrap assertions in `Eventually()` calls. Signed-off-by: Andrea Panattoni <[email protected]>
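To illustrate the first commit message above, here is a minimal stdlib-only sketch of setting the mandatory variables once, before any test runs, rather than inside individual tests. The function names `setupSuiteEnv` and `missingEnv` and the image values are hypothetical placeholders, not what the PR uses; in a Ginkgo suite the setup call would presumably sit in `BeforeSuite`.

```go
package main

import (
	"fmt"
	"os"
	"sort"
)

// requiredImages holds the variables the commit message lists as mandatory.
// The first one is spelled SPEAKER_IMAGE here on the assumption that this is
// the intended name. The values are placeholders, not real image references.
var requiredImages = map[string]string{
	"SPEAKER_IMAGE":         "example.test/speaker:dev",
	"CONTROLLER_IMAGE":      "example.test/controller:dev",
	"FRR_IMAGE":             "example.test/frr:dev",
	"KUBE_RBAC_PROXY_IMAGE": "example.test/kube-rbac-proxy:dev",
}

// setupSuiteEnv sets every required variable once, so that no individual test
// depends on another test having set it first.
func setupSuiteEnv() error {
	for name, value := range requiredImages {
		if err := os.Setenv(name, value); err != nil {
			return fmt.Errorf("setting %s: %w", name, err)
		}
	}
	return nil
}

// missingEnv returns the required variables that are unset or empty, sorted
// by name so the result is deterministic.
func missingEnv() []string {
	var missing []string
	for name := range requiredImages {
		if os.Getenv(name) == "" {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	if err := setupSuiteEnv(); err != nil {
		fmt.Println("suite setup failed:", err)
		os.Exit(1)
	}
	fmt.Println("missing after suite setup:", missingEnv())
}
```

With the variables set at suite level, each test sees the same environment regardless of the order in which tests run.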
Another instance of the same CI flake, in this run: https://github.com/metallb/metallb-operator/runs/5378870296?check_suite_focus=true
LGTM, thanks!
* Set environment variables during suite setup

  Setting environment variables inside individual tests makes the tests order dependent, since SPEAKER_IMAGE, CONTROLLER_IMAGE, FRR_IMAGE and KUBE_RBAC_PROXY_IMAGE are mandatory for the controller to work without errors. Avoid setting the WATCH_NAMESPACE environment variable, as it's not used during tests.

  Signed-off-by: Andrea Panattoni <[email protected]>

* Retry asserts in controller tests

  MetalLBReconciler and ConfigMapReconciler continue reconciling even during the AfterEach phase of tests. Since test code can't know whether a reconciler is still working, there is no way to safely and completely delete all resources during cleanup. So it's better to wrap assertions in `Eventually()` calls.

  Signed-off-by: Andrea Panattoni <[email protected]>

(cherry picked from commit cfd950d)
OCPBUGS-14503: [4.10] Change the memberlist secret to contain path
This fixes the flakiness of some unit tests, as happened in #163.