Add job to run ClusterLoader2 load test on 100 node CAPZ cluster #33423

@@ -0,0 +1,113 @@
---
periodics:
- interval: 30m
  name: ci-kubernetes-e2e-azure-scalability
  cluster: eks-prow-build-cluster
  decorate: true
  decoration_config:
    timeout: 8h
  path_alias: k8s.io/perf-tests
  tags:
  - "perfDashPrefix: azure-100Nodes-master"
  - "perfDashJobType: performance"
  - "perfDashBuildsCount: 500"
  labels:
    preset-dind-enabled: "true"
    preset-kind-volume-mounts: "true"
    preset-capz-containerd-1-7-latest: "true"
    preset-azure-community: "true"
  extra_refs:
  - org: kubernetes
    repo: perf-tests
    base_ref: "master"
    path_alias: "k8s.io/perf-tests"
  - org: kubernetes-sigs
    repo: cluster-api-provider-azure
    base_ref: main # TODO: prow-load template is only on main ATM.
    path_alias: "sigs.k8s.io/cluster-api-provider-azure"
    workdir: true
  spec:
    serviceAccountName: azure
    containers:
    - image: gcr.io/k8s-staging-test-infra/kubekins-e2e:v20240803-cf1183f2db-master
      command:
      - runner.sh
      - ./scripts/ci-entrypoint.sh
      args:
      - bash
      - -c
      # '>-' folds newlines into spaces, so the flags need no trailing backslashes.
      - >-
        cd ${GOPATH}/src/k8s.io/perf-tests/ &&
        ./run-e2e.sh cluster-loader2
        --nodes=100
        --prometheus-scrape-kubelets=true
        --prometheus-scrape-node-exporter
        --provider=aks
        --testconfig=testing/load/config.yaml
        --testconfig=testing/huge-service/config.yaml
        --testconfig=testing/access-tokens/config.yaml
        --testoverrides=./testing/experiments/enable_restart_count_check.yaml
        --testoverrides=./testing/experiments/use_simple_latency_query.yaml
        --testoverrides=./testing/overrides/load_throughput.yaml
        --v=2
      securityContext:
        privileged: true
      env:
      # CAPZ variables
      - name: CLUSTER_TEMPLATE
        value: "test/ci/cluster-template-prow-load.yaml"
      - name: NODE_MACHINE_TYPE
        value: "Standard_D16s_v3"
      - name: TEST_WINDOWS
        value: "false"
      - name: KUBERNETES_VERSION
        value: "v1.25.3"


@Jont828 will we run the test with K8s 1.25.3?

Contributor Author


Currently, perf-dash has tests for 1.28, 1.29, 1.30, 1.31, and master/main. I'll see if we can run this off the master branch.

      - name: "CONTROL_PLANE_MACHINE_COUNT"
        value: "5"
      - name: WINDOWS_WORKER_MACHINE_COUNT
        value: "0" # Don't create windows workers
      - name: WORKER_MACHINE_COUNT
        value: "100"
      - name: CL2_POD_COUNT
        value: "10"
      # clusterloader2 variables
      - name: ENABLE_PROMETHEUS_SERVER
        value: "true"
      - name: PROMETHEUS_SCRAPE_APISERVER_ONLY
        value: "true"
      - name: PROMETHEUS_APISERVER_SCRAPE_PORT
        value: "6443"
      - name: PROMETHEUS_SCRAPE_WINDOWS_NODE_EXPORTER
        value: "true"
      - name: CL2_PROMETHEUS_TOLERATE_MASTER
        value: "true"
      # from google cl2
      - name: CL2_ENABLE_DNS_PROGRAMMING
        value: "true"
      - name: CL2_SCHEDULER_THROUGHPUT_THRESHOLD
        value: "0"
      - name: CL2_ENABLE_API_AVAILABILITY_MEASUREMENT
        value: "true"
      - name: CL2_API_AVAILABILITY_PERCENTAGE_THRESHOLD
        value: "99.5"
      # azuredisk variables - required for Prometheus PVC
      - name: DEPLOY_AZURE_CSI_DRIVER
        value: "true"
      - name: AZUREDISK_CSI_DRIVER_VERSION
        value: "master"
      - name: PROMETHEUS_STORAGE_CLASS_PROVISIONER
        value: "kubernetes.io/azure-disk"
      - name: PROMETHEUS_STORAGE_CLASS_VOLUME_TYPE
        value: "StandardSSD_LRS"
      resources:
        requests:
          cpu: "2"
          memory: "9Gi"
        limits:
          cpu: "2"
          memory: "9Gi"
  annotations:
    testgrid-dashboards: sig-scalability-azure
    testgrid-tab-name: azure-master-scalability-100
    description: "Run clusterloader2 load test on a 100 node CAPZ cluster"
    testgrid-num-columns-recent: '30'
2 changes: 2 additions & 0 deletions config/testgrids/kubernetes/sig-scalability/config.yaml
@@ -2,6 +2,7 @@ dashboard_groups:
- name: sig-scalability
  dashboard_names:
  - sig-scalability-aws
  - sig-scalability-azure
  - sig-scalability-gce
  - sig-scalability-node
  - sig-scalability-kubemark
@@ -13,6 +14,7 @@ dashboard_groups:

dashboards:
- name: sig-scalability-aws
- name: sig-scalability-azure
- name: sig-scalability-gce
- name: sig-scalability-kubemark
- name: sig-scalability-node
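
The two files in this change have to agree: the job's testgrid-dashboards annotation names sig-scalability-azure, and the TestGrid config must define a dashboard with that name. The Python sketch below shows one way to check this locally. It is only an illustration, not part of test-infra: missing_dashboards is a hypothetical helper, and the dictionaries are hand-copied from the lines visible in the diff rather than loaded from the real YAML files.

```python
# Hypothetical consistency check; not part of kubernetes/test-infra tooling.
# The data below is hand-copied from the diff and covers only the visible lines.

def missing_dashboards(job: dict, testgrid: dict) -> list:
    """Return dashboards named in the job annotation that TestGrid does not define."""
    annotation = job.get("annotations", {}).get("testgrid-dashboards", "")
    # The annotation may hold a comma-separated list of dashboard names.
    wanted = [name.strip() for name in annotation.split(",") if name.strip()]
    defined = {dashboard["name"] for dashboard in testgrid.get("dashboards", [])}
    return [name for name in wanted if name not in defined]


job = {
    "name": "ci-kubernetes-e2e-azure-scalability",
    "annotations": {
        "testgrid-dashboards": "sig-scalability-azure",
        "testgrid-tab-name": "azure-master-scalability-100",
    },
}

testgrid = {
    "dashboards": [
        {"name": "sig-scalability-aws"},
        {"name": "sig-scalability-azure"},
        {"name": "sig-scalability-gce"},
        {"name": "sig-scalability-kubemark"},
        {"name": "sig-scalability-node"},
    ],
}

print(missing_dashboards(job, testgrid))  # prints []
```

An empty list means every dashboard the job refers to is defined. Without the sig-scalability-azure entry added in the second file, the same call would return that name.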