Not able to achieve successful binding rate beyond ~300 pods/sec #71

Open
rishabh325 opened this issue Jan 7, 2025 · 2 comments

@rishabh325

We are not able to achieve a successful binding rate beyond ~300 pods/sec. When running the binder in active-active mode we see a high conflict rate, while in active-passive mode the overall binding rate stays below ~300 pods/sec.

Configurations:
Nodes: 30k
Pods: ~150k
Creation Rate: ~1.75k pods/sec via clusterloader

Case 1:

Service      Leader Elected   Instances   Resource Limit
Dispatcher   No               4           2 instances w/ 32 cores/180Gi
Scheduler    No               2           2 instances w/ 32 cores/250Gi
Binder       No               4           2 instances w/ 32 cores/180Gi
[Screenshot: 2025-01-07 at 11.30.07 AM]

Case 2:

Service      Leader Elected   Instances   Resource Limit
Dispatcher   No               4           2 instances w/ 32 cores/180Gi
Scheduler    No               2           2 instances w/ 32 cores/250Gi
Binder       Yes              2           2 instances w/ 32 cores/180Gi
[Screenshot: 2025-01-07 at 11.32.39 AM]

@binacs (Member) commented Feb 2, 2025

Apologies for the temporary absence of a deployment guide for multi-instance setups, which may have caused confusion. We are working to improve this documentation as soon as possible.

In the architecture of the Godel distributed scheduler, only one dispatcher and one binder instance are expected to be active at any given time. Multiple dispatcher/binder instances are deployed for high availability and must utilize leader election to prevent conflicts.

For multiple scheduler instances, there are two possible scenarios (a rough configuration sketch follows this list):

  1. If multiple instances belong to the same shard, they should share the same scheduler name and enable leader election.
  2. If instances belong to different shards, each shard's instances should be assigned a unique scheduler name.
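
To make this concrete, below is a rough sketch of how the components might be started under these rules. It is an illustration only: the binary names, the --leader-elect / --scheduler-name flags, and the shard names are assumptions based on common Kubernetes component conventions rather than confirmed options, so please verify the exact flags against each binary's --help and the example manifests in the repository.

# Illustration only; binary names, flag spellings, and shard names are assumptions.
# Dispatcher and binder: run several replicas for HA, but leader election
# ensures only one instance is active at any time.
$ dispatcher --leader-elect=true
$ binder --leader-elect=true

# Scheduler replicas belonging to the same shard: share one scheduler name
# and elect a leader among themselves.
$ scheduler --leader-elect=true --scheduler-name=godel-scheduler-shard-a

# Scheduler replicas belonging to a different shard: a distinct scheduler name,
# so the shards can work concurrently without conflicting.
$ scheduler --leader-elect=true --scheduler-name=godel-scheduler-shard-b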

If you have any further questions, please feel free to continue the discussion in this issue.

@rishabh325 (Author) commented Feb 10, 2025

Thanks for getting back on this.

We tried the suggested configuration, i.e.:

  1. Enabling leader election on dispatcher/binder
  2. Running multiple instances of scheduler with unique names

Below are the run details:

Nodes: 30k
Pods: ~150k
Creation Rate: ~1.75k pods/sec

Service      Leader Elected   Instances   Resource Limit
Dispatcher   Yes              2           2 instances w/ 32 cores/180Gi
Scheduler    No               5           2 instances w/ 32 cores/250Gi
Binder       Yes              2           2 instances w/ 32 cores/180Gi

$ kubectl get leases -n godel-system
NAME         HOLDER                                          AGE
binder       phx5-z93_0b1bcd50-7822-4497-8270-94bed85d89b6   63d
dispatcher   phx5-2tv_0d96b8bb-5f19-4023-9dd7-14f79b59268e   63d

$ kubectl get schedulers -A
NAME                       AGE
godel-scheduler-phx5-3fq   36m
godel-scheduler-phx5-4sc   4d23h
godel-scheduler-phx5-6kt   4d23h
godel-scheduler-phx5-6yq   4d23h
godel-scheduler-phx5-uhp   4d23h

$ kubectl get schedulers godel-scheduler-phx5-3fq -o yaml
apiVersion: scheduling.godel.kubewharf.io/v1alpha1
kind: Scheduler
metadata:
  creationTimestamp: "2025-02-10T17:35:06Z"
  generation: 1
  name: godel-scheduler-phx5-3fq
  resourceVersion: "30439568215"
  uid: b3c4b9d0-d608-4448-a9a7-7dd00c433934
spec: {}
status:
  lastUpdateTime: "2025-02-10T18:12:07Z"

The binding rate still hovers around ~300 pods/sec. Also, FYI, we have enabled the DispatcherNodeShuffle feature gate to see whether node sharding helps.
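
For reference, the way one would typically enable it is via the standard feature-gates flag; this assumes the dispatcher follows the usual Kubernetes component-base convention, so the exact flag spelling and binary name should be confirmed against its --help.

# Assumption: standard component-base style feature-gate flag; verify with --help.
$ dispatcher --feature-gates=DispatcherNodeShuffle=true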

Can you also share any documentation on the various scheduling/queue metrics so we can identify the bottleneck here?
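
In the meantime, a generic way to take a first look is to port-forward to one of the component pods and scrape its Prometheus endpoint; the pod name and port below are placeholders, and it is an assumption that the components expose /metrics at all.

# Placeholder pod name and port; assumes the binder exposes a Prometheus /metrics endpoint.
$ kubectl -n godel-system port-forward pod/<binder-pod> 10251:10251
$ curl -s http://localhost:10251/metrics | grep -i -e bind -e queue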
