
Remove taint node.ocs.openshift.io/storage=true:NoSchedule from nodes during teardown of test_non_ocs_taint_and_tolerations #11456

am-agrawa opened this issue Feb 20, 2025 · 0 comments


All nodes are tainted with node.ocs.openshift.io/storage=true:NoSchedule during test_non_ocs_taint_and_tolerations, so workload pods in subsequent tests cannot reach the Running state. We should remove this taint during the test's teardown so that these follow-on failures are avoided.
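A minimal sketch of what such a teardown could look like, assuming a pytest finalizer built on the ocs_ci.utility.utils.run_cmd helper that appears in the log below; the fixture name remove_ocs_taint and the use of ignore_error are illustrative assumptions, not the existing test's code:

```python
# Hypothetical teardown sketch: untaint all nodes so later tests can
# schedule pods again. run_cmd is the ocs-ci helper seen in the log;
# everything else here is an assumption for illustration.
import logging

import pytest

from ocs_ci.utility.utils import run_cmd

log = logging.getLogger(__name__)

TAINT = "node.ocs.openshift.io/storage=true:NoSchedule"


@pytest.fixture
def remove_ocs_taint(request):
    def finalizer():
        log.info("Removing taint %s from all nodes", TAINT)
        # The trailing "-" tells `oc adm taint` to remove the taint.
        # ignore_error=True keeps teardown from failing on nodes that
        # never had the taint ("taint ... not found").
        run_cmd(f"oc adm taint nodes --all {TAINT}-", ignore_error=True)

    request.addfinalizer(finalizer)
```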

For example, test_rolling_shutdown_and_recovery failed later in the same run:

```
2025-01-04 09:30:26  23:00:25 - MainThread - ocs_ci.utility.utils - INFO  - Executing command: oc --kubeconfig /home/jenkins/current-cluster-dir/openshift-cluster-dir/auth/kubeconfig -n namespace-test-da4558c57f524a4fbef3c6e17 get Pod pod-test-rbd-10940c87cd414a149be90cbba73 -n namespace-test-da4558c57f524a4fbef3c6e17
2025-01-04 09:30:26  23:00:25 - MainThread - ocs_ci.ocs.ocp - INFO  - status of pod-test-rbd-10940c87cd414a149be90cbba73 at column STATUS was Pending, but we were waiting for Running
2025-01-04 09:30:26  23:00:25 - MainThread - ocs_ci.ocs.ocp - ERROR  - timeout expired: Timed out after 300s running get("pod-test-rbd-10940c87cd414a149be90cbba73", True, None)
2025-01-04 09:30:26  23:00:25 - MainThread - ocs_ci.utility.utils - INFO  - Executing command: oc -n namespace-test-da4558c57f524a4fbef3c6e17 describe Pod pod-test-rbd-10940c87cd414a149be90cbba73
2025-01-04 09:30:26  23:00:26 - MainThread - ocs_ci.ocs.ocp - WARNING  - Description of the resource(s) we were waiting for:
2025-01-04 09:30:26  Name:             pod-test-rbd-10940c87cd414a149be90cbba73
2025-01-04 09:30:26  Namespace:        namespace-test-da4558c57f524a4fbef3c6e17
2025-01-04 09:30:26  Priority:         0
2025-01-04 09:30:26  Service Account:  default
2025-01-04 09:30:26  Node:             <none>
2025-01-04 09:30:26  Labels:           <none>
2025-01-04 09:30:26  Annotations:      openshift.io/scc: anyuid
2025-01-04 09:30:26  Status:           Pending
2025-01-04 09:30:26  IP:               
2025-01-04 09:30:26  IPs:              <none>
2025-01-04 09:30:26  Containers:
2025-01-04 09:30:26    web-server:
2025-01-04 09:30:26      Image:        quay.io/ocsci/nginx:fio
2025-01-04 09:30:26      Port:         <none>
2025-01-04 09:30:26      Host Port:    <none>
2025-01-04 09:30:26      Environment:  <none>
2025-01-04 09:30:26      Mounts:
2025-01-04 09:30:26        /var/lib/www/html from mypvc (rw)
2025-01-04 09:30:26        /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9qgzm (ro)
2025-01-04 09:30:26  Conditions:
2025-01-04 09:30:26    Type           Status
2025-01-04 09:30:26    PodScheduled   False 
2025-01-04 09:30:26  Volumes:
2025-01-04 09:30:26    mypvc:
2025-01-04 09:30:26      Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
2025-01-04 09:30:26      ClaimName:  pvc-test-8c43d351e6364cd8a28cce5078383c6
2025-01-04 09:30:26      ReadOnly:   false
2025-01-04 09:30:26    kube-api-access-9qgzm:
2025-01-04 09:30:26      Type:                    Projected (a volume that contains injected data from multiple sources)
2025-01-04 09:30:26      TokenExpirationSeconds:  3607
2025-01-04 09:30:26      ConfigMapName:           kube-root-ca.crt
2025-01-04 09:30:26      ConfigMapOptional:       <nil>
2025-01-04 09:30:26      DownwardAPI:             true
2025-01-04 09:30:26      ConfigMapName:           openshift-service-ca.crt
2025-01-04 09:30:26      ConfigMapOptional:       <nil>
2025-01-04 09:30:26  QoS Class:                   BestEffort
2025-01-04 09:30:26  Node-Selectors:              <none>
2025-01-04 09:30:26  Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
2025-01-04 09:30:26                               node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
2025-01-04 09:30:26  Events:
2025-01-04 09:30:26    Type     Reason            Age   From               Message
2025-01-04 09:30:26    ----     ------            ----  ----               -------
2025-01-04 09:30:26    Warning  FailedScheduling  5m1s  default-scheduler  0/6 nodes are available: 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 3 node(s) had untolerated taint {node.ocs.openshift.io/storage: true}. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
```

The failure occurred on ODF 4.18.0-121.
