Concurrent fetch of azure metricdefinitions and batchApi usage #41790

Open
wants to merge 30 commits into
base: main

MichaelKatsoulis (Contributor) commented Nov 26, 2024

The changes affect the Azure monitor metricset and the related metricsets. The affected metricsets are:

  • monitor
  • container_registry
  • container_instance
  • container_service
  • compute_vm
  • compute_vm_scaleset
  • database_account

A new boolean configuration parameter, enable_batch_api, is introduced.
If set to false (the default), nothing changes in the way metrics are collected for these metricsets.

If set to true:

  • The metric definitions of the resources are collected asynchronously, and the results are written to a channel.
  • The channel is read until the number of definitions collected reaches 50 (the batch API limit).
  • The metric definitions are then grouped based on the criteria listed below (1), and the Azure batch API is used to
    retrieve the metrics of multiple resources with one API call.
  1. The grouping criteria are:
  • Namespace
  • SubscriptionID
  • Location
  • Names
  • TimeGrain
  • Dimensions
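
The flow described above can be sketched roughly as follows. This is a minimal Python illustration, not the PR's Go implementation: `MetricDefinition`, `fetch_definitions`, and `plan_batch_calls` are hypothetical names, a `queue.Queue` stands in for the Go channel, and the definitions are generated locally instead of being fetched from Azure.

```python
import queue
import threading
from collections import defaultdict
from dataclasses import dataclass

BATCH_LIMIT = 50  # batch API limit quoted in the PR description


@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical record holding the fields used as grouping criteria."""
    resource_id: str
    namespace: str
    subscription_id: str
    location: str
    names: tuple
    timegrain: str
    dimensions: tuple


def group_key(d):
    # Definitions that share all of these values can go into one batch request.
    return (d.namespace, d.subscription_id, d.location,
            d.names, d.timegrain, d.dimensions)


def fetch_definitions(resource_ids, out):
    # Stand-in for the per-resource "list metric definitions" call.
    for rid in resource_ids:
        out.put(MetricDefinition(rid, "Microsoft.KeyVault/vaults", "sub-1",
                                 "westeurope", ("Availability",), "PT1M", ()))
    out.put(None)  # sentinel: this worker has finished


def plan_batch_calls(resource_ids, workers=4):
    """Return one (key, resource_ids) entry per batch call that would be made."""
    out = queue.Queue()
    slices = [resource_ids[i::workers] for i in range(workers)]
    threads = [threading.Thread(target=fetch_definitions, args=(s, out))
               for s in slices]
    for t in threads:
        t.start()

    calls, pending, finished = [], [], 0

    def flush():
        # Group the pending definitions; each group maps to one batch call.
        groups = defaultdict(list)
        for d in pending:
            groups[group_key(d)].append(d.resource_id)
        calls.extend(groups.items())
        pending.clear()

    # Read from the queue while the workers are still producing.
    while finished < workers:
        item = out.get()
        if item is None:
            finished += 1
            continue
        pending.append(item)
        if len(pending) == BATCH_LIMIT:
            flush()
    if pending:
        flush()
    for t in threads:
        t.join()
    return calls


calls = plan_batch_calls([f"vault-{i:03d}" for i in range(120)])
print(sorted(len(ids) for _, ids in calls))  # [20, 50, 50]
```

With 120 resources that all share the same grouping key, the sketch plans three batch calls instead of 120 per-resource calls. Resources that differ in any criterion would end up in separate groups and therefore separate calls.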

Proposed commit message

  • WHAT: Introduce the enable_batch_api parameter for concurrent fetching of Azure metric definitions and collection of metric values using the batch API
  • WHY: Helps mitigate scalability problems

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

MichaelKatsoulis requested review from a team as code owners November 26, 2024 12:08
botelastic bot added the needs_team label (Indicates that the issue/PR needs a Team:* label) Nov 26, 2024
MichaelKatsoulis marked this pull request as draft November 26, 2024 12:08

mergify bot (Contributor) commented Nov 26, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b concurrent-fetch-of-azure-metricdefinitions upstream/concurrent-fetch-of-azure-metricdefinitions
git merge upstream/main
git push upstream concurrent-fetch-of-azure-metricdefinitions

mergify bot (Contributor) commented Nov 26, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @MichaelKatsoulis? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit

mergify bot (Contributor) commented Nov 26, 2024

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

mergify bot added the backport-8.x label (Automated backport to the 8.x branch with mergify) Nov 26, 2024

zmoog (Contributor) commented Jan 10, 2025

Microsoft.DocumentDb/databaseAccounts (1 resource)

resource type: Microsoft.DocumentDb/databaseAccounts
resource count: 1 resource
versions tested:

  • 8.17.1 (branch 8.17)
  • 9.0.0 (branch MichaelKatsoulis:concurrent-fetch-of-azure-metricdefinitions)

Activity:

  • I created one "Azure Cosmos DB for NoSQL", with Provisioned throughput (default settings)
  • I set up the standard Metricbeat database account module
# x-pack/metricbeat/modules.d/azure.yml
- module: azure
  metricsets:
  - database_account
  enabled: true
  period: 300s
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s
  • 8.17.1 and 9.0.0 are creating the same metrics (cardinality and values).

UPDATE: I didn't build the right version, I'm re-testing 9.0.0

8.17.1

[Screenshot: CleanShot 2025-01-10 at 13 16 51@2x]

9.0.0

  • Data collected regularly: yes

Issues

(1) Timegrain for azure.database_account.create_account.count is empty

[Screenshot: CleanShot 2025-01-10 at 15 49 18@2x]

In version 8.17.1, the timegrain for this field is PT5M.

(2) The azure.database_account.service_availability.avg (timegrain PT1H) is missing

Version 9.0.0 always collects 7 documents with PT5M, while version 8.17.1 collects 7 documents with PT5M plus 1 document with PT1H during the first iteration and again every 60 minutes.

Is 9.0.0 missing the PT1H document on the first iteration? Waiting for the next iteration to double-check.

After 75 minutes, there is still no azure.database_account.service_availability.avg field with PT1H.

[Screenshot: CleanShot 2025-01-10 at 16 30 53@2x]

UPDATE: tested by @MichaelKatsoulis

I managed to collect the azure.database_account.service_availability.avg field with PT1H using the PR code. The problem is that the API requests metric values for both the ServiceAvailability and ReplicationLatency metrics with the Average aggregation. When values for both metrics are requested, service_availability.avg is always nil. If we remove ReplicationLatency and request values only for ServiceAvailability, service_availability.avg is returned correctly. We still do not know the reason for this.

zmoog (Contributor) commented Jan 10, 2025

UPDATE: I built the wrong version, I'm re-testing 9.0.0 with Microsoft.DocumentDb/databaseAccounts (1 resource) and I'll update the previous comment.

My apologies for the noise.

zmoog (Contributor) commented Jan 10, 2025

Microsoft.KeyVault/vaults (10 resources)

resource type: Microsoft.KeyVault/vaults
resource count: 10 resources
versions tested:

  • 8.17.1 (branch 8.17)
  • 9.0.0 (branch MichaelKatsoulis:concurrent-fetch-of-azure-metricdefinitions)

Activity:

  • I set up a custom Metricbeat config using the Azure Monitor metricset to target the key vaults
- module: azure  
  metricsets:  
    - monitor  
  enabled: true  
  period: 60s  
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s  
  resources:  
  - resource_query: "resourceType eq 'Microsoft.KeyVault/vaults'"  
    resource_group:  
    - "mbranca-az-scalability-kv-r10"    
    metrics:  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
          - name: StatusCode  
            value: '*'  
          - name: StatusCodeClass  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - ServiceApiLatency  
          - Availability  
          - ServiceApiResult  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - ServiceApiHit  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
          - name: TransactionType  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - SaturationShoebox  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M

Notes:

When the key vaults are unused (as in this resource group), they only generate a subset of metrics:

  • Availability
  • API Hits
  • API Results.

8.17.1

In progress.

I can see the three metrics (Availability, API Hits, API Results), grouped in two documents. So 2 documents x 10 resources = 20 documents per iteration:

[Screenshot: CleanShot 2025-01-10 at 16 35 28@2x]

9.0.0

In progress.

First iterations are okay. I get the same number of documents (20) as 8.17.1 and same values.

[Screenshot: CleanShot 2025-01-10 at 16 48 50@2x]

Still checking, but this case looks good.

zmoog (Contributor) commented Jan 10, 2025

@MichaelKatsoulis, I found a couple of issues related to timegrain in the Microsoft.DocumentDb/databaseAccounts (1 resource) test.

zmoog (Contributor) commented Jan 10, 2025

Microsoft.ContainerRegistry/registries (1 resource)

resource type: Microsoft.ContainerRegistry/registries
resource count: 1 resource
versions tested:

  • 8.17.1 (branch 8.17)
  • 9.0.0 (branch MichaelKatsoulis:concurrent-fetch-of-azure-metricdefinitions)

Activity:

  • I set up a Metricbeat config using the container_registry metricset
- module: azure
  metricsets:
  - container_registry
  enabled: true
  period: 300s
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s

Since we had issues with PT1H metrics, I tried another metricset that uses this timegrain.

8.17.1

After one iteration, 8.17.1 collected:

  • 1 document with PT5M every 5 minutes
  • 1 document with PT1H every 60 minutes

9.0.0

After one iteration, 9.0.0 collected:

  • 1 document with PT5M every 5 minutes
  • 1 document with PT1H every 60 minutes

Conclusion

✅ With the recent code changes 8.17.1 and 9.0.0 yield the same outcome.

[Screenshot: CleanShot 2025-01-15 at 13 23 47@2x]

Metrics docs

zmoog (Contributor) commented Jan 24, 2025

Microsoft.KeyVault/vaults (200 resources)

resource type: Microsoft.KeyVault/vaults
resource count: 200 resources
versions tested:

  • 8.17.1
  • PR (branch MichaelKatsoulis:concurrent-fetch-of-azure-metricdefinitions, commit 54d4c03)

Activity:

I set up a custom Metricbeat config using the Azure Monitor metricset to target the key vaults

- module: azure  
  metricsets:  
    - monitor  
  enabled: true  
  period: 60s  
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s  
  resources:  
  - resource_query: "resourceType eq 'Microsoft.KeyVault/vaults'"  
    resource_group:  
    - "mbranca-az-scalability-kv-r200"
    metrics:  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
          - name: StatusCode  
            value: '*'  
          - name: StatusCodeClass  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - ServiceApiLatency  
          - Availability  
          - ServiceApiResult  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - ServiceApiHit  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
          - name: TransactionType  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - SaturationShoebox  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M

To have a few metrics from each key vault, I'm running the following script:

for j in $(seq 1 500); do                                                                                                           
    for i in $(seq -f "%03g" 1 200); do
        az keyvault show --resource-group mbranca-az-scalability-kv-r200 --name "mbrancar200s$i"
        echo "Iteration: $j, resource: mbrancar200s$i"
    done
done

8.17.1

| Metric | Value | Description |
| --- | --- | --- |
| Gap (min) | 4m | Minimum time between collections |
| Gap (max) | 7m | Maximum time between collections |
| Gap (avg) | 5.4m (7, 4, 6, 5, 5) | Average time between collections |
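
The summary values above can be reproduced from the individual gaps listed in parentheses. A minimal Python sketch, where `gap_stats` is a hypothetical helper and the samples are the five gaps (in minutes) reported for 8.17.1:

```python
from statistics import mean


def gap_stats(gaps_minutes):
    """Return (min, max, avg) of the gaps between collections, in minutes."""
    return min(gaps_minutes), max(gaps_minutes), round(mean(gaps_minutes), 1)


print(gap_stats([7, 4, 6, 5, 5]))  # (4, 7, 5.4)
```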

[Screenshot: CleanShot 2025-01-26 at 22 43 15@2x]

PR

IMPORTANT: don't forget to set enable_batch_api: true in the config file when you run the PR code.

- module: azure  
  metricsets:  
    - monitor  
  enabled: true  
  period: 60s  
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s  
  enable_batch_api: true # <------ 🔥

| Metric | Value | Description |
| --- | --- | --- |
| Gap (min) | 1m | Minimum time between collections |
| Gap (max) | 2m | Maximum time between collections |
| Gap (avg) | 1.2m (2, 1, 1, 1, 1, 1, 2) | Average time between collections |

[Screenshot: CleanShot 2025-01-27 at 13 42 11@2x]

zmoog (Contributor) commented Jan 24, 2025

Microsoft.KeyVault/vaults (400 resources)

resource type: Microsoft.KeyVault/vaults
resource count: 400 resources
versions tested:

  • 8.17.1
  • PR (branch MichaelKatsoulis:concurrent-fetch-of-azure-metricdefinitions, commit 54d4c03)

Activity:

I set up a custom Metricbeat config using the Azure Monitor metricset to target the key vaults

- module: azure  
  metricsets:  
    - monitor  
  enabled: true  
  period: 60s  
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s  
  resources:  
  - resource_query: "resourceType eq 'Microsoft.KeyVault/vaults'"  
    resource_group:  
    - "mbranca-az-scalability-kv-r400"
    metrics:  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
          - name: StatusCode  
            value: '*'  
          - name: StatusCodeClass  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - ServiceApiLatency  
          - Availability  
          - ServiceApiResult  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - ServiceApiHit  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
          - name: TransactionType  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - SaturationShoebox  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M

To have a few metrics from each key vault, I'm running the following script:

for j in $(seq 1 500); do                                                                                                           
    for i in $(seq -f "%03g" 1 400); do
        az keyvault show --resource-group mbranca-az-scalability-kv-r400 --name "mbrancar400s$i"
        echo "Iteration: $j, resource: mbrancar400s$i"
    done
done

8.17.1

| Metric | Value | Description |
| --- | --- | --- |
| Gap (min) | 11m | Minimum time between collections |
| Gap (max) | 13m | Maximum time between collections |
| Gap (avg) | 11.6m (12, 13, 11, 11, 11) | Average time between collections |

[Screenshot: CleanShot 2025-01-27 at 00 05 25@2x]

PR

IMPORTANT: don't forget to set enable_batch_api: true in the config file when you run the PR code.

- module: azure  
  metricsets:  
    - monitor  
  enabled: true  
  period: 60s  
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s  
  enable_batch_api: true # <------ 🔥

| Metric | Value | Description |
| --- | --- | --- |
| Gap (min) | 1m | Minimum time between collections |
| Gap (max) | 3m | Maximum time between collections |
| Gap (avg) | ~1.4m (3, 1, 1, 1, 1, 1, 1, 1, 3) | Average time between collections |

Note: collecting the metric values always takes 1m; an iteration takes 3m when it also needs to refresh the metric definitions.

[Screenshot: CleanShot 2025-01-27 at 12 54 15@2x]

zmoog (Contributor) commented Jan 24, 2025

Microsoft.KeyVault/vaults (800 resources)

resource type: Microsoft.KeyVault/vaults
resource count: 800 resources
versions tested:

  • PR (branch MichaelKatsoulis:concurrent-fetch-of-azure-metricdefinitions, commit 54d4c03)

Activity:

I set up a custom Metricbeat config using the Azure Monitor metricset to target the key vaults

- module: azure  
  metricsets:  
    - monitor  
  enabled: true  
  period: 60s  
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s  
  resources:  
  - resource_query: "resourceType eq 'Microsoft.KeyVault/vaults'"  
    resource_group:  
    - "mbranca-az-scalability-kv-r800"
    metrics:  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
          - name: StatusCode  
            value: '*'  
          - name: StatusCodeClass  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - ServiceApiLatency  
          - Availability  
          - ServiceApiResult  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - ServiceApiHit  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
          - name: TransactionType  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - SaturationShoebox  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M

To have a few metrics from each key vault, I'm running the following script:

for j in $(seq 1 500); do                                                                                                           
    for i in $(seq -f "%03g" 1 800); do
        az keyvault show --resource-group mbranca-az-scalability-kv-r800 --name "mbrancar800s$i"
        echo "Iteration: $j, resource: mbrancar800s$i"
    done
done

PR

IMPORTANT: don't forget to set enable_batch_api: true in the config file when you run the PR code.

- module: azure  
  metricsets:  
    - monitor  
  enabled: true  
  period: 60s  
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s  
  enable_batch_api: true # <------ 🔥

| Metric | Value | Description |
| --- | --- | --- |
| Gap (min) | 1m | Minimum time between collections |
| Gap (max) | 5m | Maximum time between collections |
| Gap (avg) | 1.8m (5, 1, 1, 1, 1, 1, 4) | Average time between collections |

[Screenshot: CleanShot 2025-01-27 at 13 17 00@2x]

zmoog (Contributor) commented Jan 24, 2025

Recap after running a batch of tests with 200, 400, and 800 resources and a collection period of 60s.

| Resources | Average gap | Time per resource (gap / resources) |
| --- | --- | --- |
| 200 | 5m | 1.5s |
| 400 | 11m | 1.65s |
| 800 786 | 23m | 1.75s |
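
The last column is the average gap divided by the number of resources. A small Python sketch of that arithmetic, with `seconds_per_resource` as a hypothetical helper name. The last row lists both 800 and 786, so both divisors are shown; which of the two was actually used is an assumption.

```python
def seconds_per_resource(gap_minutes, resources):
    """Average collection gap, converted to seconds, divided by the resource count."""
    return gap_minutes * 60 / resources


# (average gap in minutes, resource count) pairs taken from the recap rows.
for gap, count in [(5, 200), (11, 400), (23, 800), (23, 786)]:
    print(f"{count} resources: {seconds_per_resource(gap, count):.2f}s per resource")
```

The per-resource cost stays roughly constant as the resource count grows, which is consistent with the total gap growing roughly linearly with the number of resources in these runs.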

zmoog (Contributor) commented Jan 26, 2025

Microsoft.KeyVault/vaults (100 resources)

resource type: Microsoft.KeyVault/vaults
resource count: 100 resources
versions tested:

  • 8.17.1
  • PR (branch MichaelKatsoulis:concurrent-fetch-of-azure-metricdefinitions, commit 54d4c03)

Activity:

I set up a custom Metricbeat config using the Azure Monitor metricset to target the key vaults

- module: azure  
  metricsets:  
    - monitor  
  enabled: true  
  period: 60s  
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s  
  resources:  
  - resource_query: "resourceType eq 'Microsoft.KeyVault/vaults'"  
    resource_group:  
    - "mbranca-az-scalability-kv-r100"    
    metrics:  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
          - name: StatusCode  
            value: '*'  
          - name: StatusCodeClass  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - ServiceApiLatency  
          - Availability  
          - ServiceApiResult  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - ServiceApiHit  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M  
      - dimensions:  
          - name: ActivityType  
            value: '*'  
          - name: ActivityName  
            value: '*'  
          - name: TransactionType  
            value: '*'  
        ignore_unsupported: true  
        name:  
          - SaturationShoebox  
        namespace: Microsoft.KeyVault/vaults  
        timegrain: PT1M

To have a few metrics from each key vault, I'm running the following script:

for j in $(seq 1 500); do                                                                                                           
    for i in $(seq -f "%03g" 1 100); do
        az keyvault show --resource-group mbranca-az-scalability-kv-r100 --name "mbrancar100s$i"
        echo "Iteration: $j, resource: mbrancar100s$i"
    done
done

8.17.1

| Metric | Value | Description |
| --- | --- | --- |
| Gap (min) | 2m | Minimum time between collections |
| Gap (max) | 4m | Maximum time between collections |
| Gap (avg) | ~2.6m (4, 3, 2, 2, 3, 2) | Average time between collections |

[Screenshot: CleanShot 2025-01-26 at 22 04 29@2x]

PR

IMPORTANT: don't forget to set enable_batch_api: true in the config file when you run the PR code.

- module: azure  
  metricsets:  
    - monitor  
  enabled: true  
  period: 60s  
  client_id: '${AZURE_CLIENT_ID:""}'
  client_secret: '${AZURE_CLIENT_SECRET:""}'
  tenant_id: '${AZURE_TENANT_ID:""}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID:""}'
  refresh_list_interval: 600s  
  enable_batch_api: true # <------ 🔥

| Metric | Value | Description |
| --- | --- | --- |
| Gap (min) | 1m | Minimum time between collections |
| Gap (max) | 2m | Maximum time between collections |
| Gap (avg) | 1.2m (2, 1, 1, 1, 1) | Average time between collections |

[Screenshot: CleanShot 2025-01-27 at 12 39 58@2x]

zmoog (Contributor) commented Jan 27, 2025

Recap table: min / max / avg gap between collections (in minutes) while collecting Microsoft.KeyVault/vaults

Collection period: 1m
Timegrain: PT1M

| Resources | 8.17.1 | PR |
| --- | --- | --- |
| 50 | 1 / 2 / 1.4 | 1 / 2 / 1.2 |
| 100 | 2 / 4 / 2.6 | 1 / 2 / 1.2 |
| 200 | 4 / 7 / 5.4 | 1 / 2 / 1.2 (1m on a non-definitions refresh iteration) |
| 400 | 11 / 13 / 11.6 | 1 / 3 / 1.4 (1m on a non-definitions refresh iteration) |
| 800 | 23 / 23 / na | 1 / 5 / 1.8 (1m on a non-definitions refresh iteration) |

Note: the first iteration in the series is slower because it needs to fetch the list of metric definitions. With 50 resources, the delay only happens on start. With more resources (100+), the delay repeats every 10 minutes (the default definitions refresh interval).

UPDATE: I didn't set the enable_batch_api: true in the PR, so we had to re-run the tests! 🤦 Sorry @MichaelKatsoulis!

MichaelKatsoulis marked this pull request as ready for review January 27, 2025 12:30

MichaelKatsoulis (Contributor, Author)

@elastic/beats-tech-leads could you also review this PR as you are code owners?

There are still some pending updates around unit testing, but they will be ready by tomorrow.

zmoog requested review from gizas and constanca-m January 27, 2025 15:10

MichaelKatsoulis (Contributor, Author)

@cmacknz or @jlind23 could one of you review this PR on behalf of the Beats tech leads? We would like to include it in 9.0.0.

MichaelKatsoulis added the Team:obs-ds-hosted-services label (Label for the Observability Hosted Services team) Jan 28, 2025

elasticmachine (Collaborator)

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

botelastic bot removed the needs_team label (Indicates that the issue/PR needs a Team:* label) Jan 28, 2025

MichaelKatsoulis (Contributor, Author)

We decided with @zmoog and @bturquet that this PR will target the 8.18.1 and 9.0.1 releases. Our goal is to test it more thoroughly, add integration tests, and also add support for storage accounts if possible.
