Concurrent fetch of azure metricdefinitions and batchApi usage #41790
Microsoft.DocumentDb/databaseAccounts (1 resource)
resource type:
Activity:

    # x-pack/metricbeat/modules.d/azure.yml
    - module: azure
      metricsets:
        - database_account
      enabled: true
      period: 300s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s

UPDATE: I didn't build the right version; I'm re-testing 9.0.0.
8.17.1
9.0.0
Issues
(1) Timegrain for azure.database_account.create_account.count is empty
In version 8.17.1, the timegrain for this field is PT5M.
(2) The azure.database_account.service_availability.avg (timegrain PT1H) is missing
Version 9.0.0 always collects 7 documents with PT5M, while version 8.17.1 collects 7 documents with PT5M + 1 document with PT1H during the first iteration and again every 60 mins. Is 9.0.0 missing the PT1H document on the first iteration? Waiting for the next iteration to double-check. After 75 mins, no
UPDATE: tested by @MichaelKatsoulis I managed to collect
UPDATE: I built the wrong version; I'm re-testing 9.0.0 with Microsoft.DocumentDb/databaseAccounts (1 resource) and I'll update the previous comment. My apologies for the noise.
Microsoft.KeyVault/vaults (10 resources)
resource type:
Activity:

    - module: azure
      metricsets:
        - monitor
      enabled: true
      period: 60s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s
      resources:
        - resource_query: "resourceType eq 'Microsoft.KeyVault/vaults'"
          resource_group:
            - "mbranca-az-scalability-kv-r10"
          metrics:
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
                - name: StatusCode
                  value: '*'
                - name: StatusCodeClass
                  value: '*'
              ignore_unsupported: true
              name:
                - ServiceApiLatency
                - Availability
                - ServiceApiResult
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
              ignore_unsupported: true
              name:
                - ServiceApiHit
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
                - name: TransactionType
                  value: '*'
              ignore_unsupported: true
              name:
                - SaturationShoebox
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M

Notes: When the key vaults are unused (like in this resource group), they only generate a subset of metrics:
8.17.1
In progress. I can see the three metrics (Availability, API Hits, API Results), grouped in two documents. So 2 documents x 10 resources = 20 documents per iteration:
9.0.0
In progress. The first iterations are okay. I get the same number of documents (20) as 8.17.1 and the same values. Still checking, but this case looks good.
@MichaelKatsoulis, I found a couple of issues related to timegrain in the Microsoft.DocumentDb/databaseAccounts (1 resource) test.
Microsoft.ContainerRegistry/registries (1 resource)
resource type:
Activity:

    - module: azure
      metricsets:
        - container_registry
      enabled: true
      period: 300s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s

Since we had issues with PT1H metrics, I tried another metricset with this timegrain.
8.17.1
After one iteration, 8.17.1 collected:
9.0.0
After one iteration, 8.17.1 collected:
Conclusion
✅ With the recent code changes, 8.17.1 and 9.0.0 yield the same outcome.
Metrics docs
Microsoft.KeyVault/vaults (200 resources)
resource type:
Activity:
I set up a custom Metricbeat config using the Azure Monitor metricset to target the key vaults:

    - module: azure
      metricsets:
        - monitor
      enabled: true
      period: 60s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s
      resources:
        - resource_query: "resourceType eq 'Microsoft.KeyVault/vaults'"
          resource_group:
            - "mbranca-az-scalability-kv-r200"
          metrics:
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
                - name: StatusCode
                  value: '*'
                - name: StatusCodeClass
                  value: '*'
              ignore_unsupported: true
              name:
                - ServiceApiLatency
                - Availability
                - ServiceApiResult
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
              ignore_unsupported: true
              name:
                - ServiceApiHit
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
                - name: TransactionType
                  value: '*'
              ignore_unsupported: true
              name:
                - SaturationShoebox
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M

To have a few metrics from each key vault, I'm running the following script:

    for j in $(seq 1 500); do
      for i in $(seq -f "%03g" 1 200); do
        az keyvault show --resource-group mbranca-az-scalability-kv-r200 --name "mbrancar200s$i"
        echo "Iteration: $j, resource: mbrancar200s$i"
      done
    done

8.17.1
PR
IMPORTANT: don't forget to set

    - module: azure
      metricsets:
        - monitor
      enabled: true
      period: 60s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s
      enable_batch_api: true # <------ 🔥

Microsoft.KeyVault/vaults (400 resources)
resource type:
Activity:
I set up a custom Metricbeat config using the Azure Monitor metricset to target the key vaults:

    - module: azure
      metricsets:
        - monitor
      enabled: true
      period: 60s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s
      resources:
        - resource_query: "resourceType eq 'Microsoft.KeyVault/vaults'"
          resource_group:
            - "mbranca-az-scalability-kv-r400"
          metrics:
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
                - name: StatusCode
                  value: '*'
                - name: StatusCodeClass
                  value: '*'
              ignore_unsupported: true
              name:
                - ServiceApiLatency
                - Availability
                - ServiceApiResult
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
              ignore_unsupported: true
              name:
                - ServiceApiHit
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
                - name: TransactionType
                  value: '*'
              ignore_unsupported: true
              name:
                - SaturationShoebox
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M

To have a few metrics from each key vault, I'm running the following script:

    for j in $(seq 1 500); do
      for i in $(seq -f "%03g" 1 400); do
        az keyvault show --resource-group mbranca-az-scalability-kv-r400 --name "mbrancar400s$i"
        echo "Iteration: $j, resource: mbrancar400s$i"
      done
    done

8.17.1
PR
IMPORTANT: don't forget to set

    - module: azure
      metricsets:
        - monitor
      enabled: true
      period: 60s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s
      enable_batch_api: true # <------ 🔥

Note: metric value collection always takes 1m; it takes 3m when it needs to refresh the metric definitions.
Microsoft.KeyVault/vaults (800 resources)
resource type:
Activity:
I set up a custom Metricbeat config using the Azure Monitor metricset to target the key vaults:

    - module: azure
      metricsets:
        - monitor
      enabled: true
      period: 60s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s
      resources:
        - resource_query: "resourceType eq 'Microsoft.KeyVault/vaults'"
          resource_group:
            - "mbranca-az-scalability-kv-r800"
          metrics:
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
                - name: StatusCode
                  value: '*'
                - name: StatusCodeClass
                  value: '*'
              ignore_unsupported: true
              name:
                - ServiceApiLatency
                - Availability
                - ServiceApiResult
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
              ignore_unsupported: true
              name:
                - ServiceApiHit
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
                - name: TransactionType
                  value: '*'
              ignore_unsupported: true
              name:
                - SaturationShoebox
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M

To have a few metrics from each key vault, I'm running the following script:

    for j in $(seq 1 500); do
      for i in $(seq -f "%03g" 1 800); do
        az keyvault show --resource-group mbranca-az-scalability-kv-r800 --name "mbrancar800s$i"
        echo "Iteration: $j, resource: mbrancar800s$i"
      done
    done

PR
IMPORTANT: don't forget to set

    - module: azure
      metricsets:
        - monitor
      enabled: true
      period: 60s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s
      enable_batch_api: true # <------ 🔥

Recap after running a batch of tests with 200, 400, and 800 resources with a collection period of 60s.
Microsoft.KeyVault/vaults (100 resources)
resource type:
Activity:
I set up a custom Metricbeat config using the Azure Monitor metricset to target the key vaults:

    - module: azure
      metricsets:
        - monitor
      enabled: true
      period: 60s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s
      resources:
        - resource_query: "resourceType eq 'Microsoft.KeyVault/vaults'"
          resource_group:
            - "mbranca-az-scalability-kv-r100"
          metrics:
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
                - name: StatusCode
                  value: '*'
                - name: StatusCodeClass
                  value: '*'
              ignore_unsupported: true
              name:
                - ServiceApiLatency
                - Availability
                - ServiceApiResult
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
              ignore_unsupported: true
              name:
                - ServiceApiHit
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M
            - dimensions:
                - name: ActivityType
                  value: '*'
                - name: ActivityName
                  value: '*'
                - name: TransactionType
                  value: '*'
              ignore_unsupported: true
              name:
                - SaturationShoebox
              namespace: Microsoft.KeyVault/vaults
              timegrain: PT1M

To have a few metrics from each key vault, I'm running the following script:

    for j in $(seq 1 500); do
      for i in $(seq -f "%03g" 1 100); do
        az keyvault show --resource-group mbranca-az-scalability-kv-r100 --name "mbrancar100s$i"
        echo "Iteration: $j, resource: mbrancar100s$i"
      done
    done

8.17.1
PR
IMPORTANT: don't forget to set

    - module: azure
      metricsets:
        - monitor
      enabled: true
      period: 60s
      client_id: '${AZURE_CLIENT_ID:""}'
      client_secret: '${AZURE_CLIENT_SECRET:""}'
      tenant_id: '${AZURE_TENANT_ID:""}'
      subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
      refresh_list_interval: 600s
      enable_batch_api: true # <------ 🔥

Recap table: min / max / avg minutes while collecting
Collection period: 1m
Note: the first iteration in the series is slower because it needs to fetch the list of metric definitions. With 50 resources, the delay only happens on start. With more resources (100+), the delay repeats every 10 mins (the default definitions refresh interval).
UPDATE: I didn't set the
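To make the refresh cadence in the note above concrete, here is a minimal sketch that lists which collection iterations would pay the metric-definitions refresh cost. It is a simplified model and not Metricbeat's actual scheduler: `refresh_iterations` is a hypothetical helper, iterations are assumed to start exactly every `period` seconds, and the time spent collecting is ignored.

```shell
# Simplified model (an assumption, not Metricbeat's real scheduling logic):
# iteration i starts at (i-1)*period seconds; metric definitions are fetched
# on the first iteration and again whenever at least `interval` seconds have
# passed since the previous fetch.
refresh_iterations() {
  period=$1; interval=$2; count=$3
  last=-1; out=""; i=1
  while [ "$i" -le "$count" ]; do
    t=$(( (i - 1) * period ))
    if [ "$last" -lt 0 ] || [ $(( t - last )) -ge "$interval" ]; then
      out="${out:+$out }$i"   # append the iteration number, space-separated
      last=$t
    fi
    i=$(( i + 1 ))
  done
  echo "$out"
}

# period: 60s, refresh_list_interval: 600s, first 25 iterations
refresh_iterations 60 600 25   # prints: 1 11 21
```

Under these assumptions, with `period: 60s` and `refresh_list_interval: 600s`, every tenth iteration after the first would be a slow one, which matches the "repeats every 10 mins" observation.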
@elastic/beats-tech-leads could you also review this PR, as you are code owners? There are still some pending updates around unit testing, but they will be ready by tomorrow.
Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)
The changes affect azure monitor and the relevant metricsets. The affected metricsets are:
monitor
container_registry
container_instance
container_service
compute_vm
compute_vm_scaleset
database_account
A new configuration parameter, enable_batch_api, of type boolean, is introduced.
If set to false (default), nothing changes in the way the metrics are collected for these metricsets.
If set to true, the metrics of multiple resources are collected with one API call.
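As a rough illustration of why batching helps at the resource counts tested in this conversation, the sketch below compares the number of metric-value requests with and without batching. It is only a back-of-the-envelope model: `api_calls` is a hypothetical helper, the batch size of 50 is an assumed per-request limit, and any further grouping the implementation may apply (for example by namespace, timegrain, or dimensions) is ignored.

```shell
# Number of requests needed to cover `resources` resources when one request
# can carry up to `batch_size` resource IDs (integer ceiling division).
api_calls() {
  resources=$1; batch_size=$2
  echo $(( (resources + batch_size - 1) / batch_size ))
}

# batch_size=50 is an assumption used for illustration only.
for n in 100 200 400 800; do
  echo "$n resources: $n calls one-by-one, $(api_calls "$n" 50) calls batched"
done
```

With these assumptions, 800 resources need 16 batched requests instead of 800 individual ones per metric group.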