v2.7.1 crash with v2.6.1 config, need assistance #4732

Open
StefanSa opened this issue Feb 21, 2025 · 7 comments
Comments

@StefanSa

Describe the bug
I am running v2.6.1 with this configuration without any problems.

# For more information on this configuration, see the complete reference guide at
# https://grafana.com/docs/tempo/latest/configuration/

stream_over_http_enabled: true

multitenancy_enabled: false
usage_report:
  reporting_enabled: false

compactor:
  compaction:
    block_retention: 1h

distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318

ingester:
  trace_idle_period: 10s
  max_block_bytes: 1_000_000
  max_block_duration: 5m

#querier:
#  frontend_worker:
#    frontend_address: tempo:9095

server:
  http_listen_port: 3200
  grpc_listen_port: 9095
  log_level: info

metrics_generator:
  processor:
    span_metrics:
      # Configure extra dimensions to add as metric labels.
      dimensions:
      - http.method
      - http.target
      - http.status_code
      - service.version
    # Service graph metrics create node and edge metrics for determining service interactions.
    service_graphs:
      # Configure extra dimensions to add as metric labels.
      dimensions:
      - http.method
      - http.target
      - http.status_code
      - service.version
  storage:
    path: /tmp/tempo/generator/wal
    remote_write_add_org_id_header: true
    remote_write:
    - url: http://gateway:8080/api/v1/push
      send_exemplars: true
      send_native_histograms: true
      headers:
        X-Scope-OrgID: "anonymous"
  traces_storage:
    path: /tmp/tempo/generator/traces

storage:
  trace:
    backend: s3
    wal:
      path: /tmp/tempo/wal
    s3:
      bucket: tempo-data
      endpoint: 172.17.34.65:9000
      access_key: ${TEMPO_S3_ACCESS_KEY:-lgtmp}
      secret_key: ${TEMPO_S3_SECRET_KEY:-supersecret}
      insecure: ${TEMPO_S3_INSECURE:-true}
      tls_insecure_skip_verify: true

# https://github.com/grafana/tempo/blob/main/docs/sources/tempo/configuration/_index.md#cache
cache:
  background:
    writeback_goroutines: 5
  caches:
  - roles:
    - bloom
    - parquet-footer
    - parquet-page
    - frontend-search
    - parquet-column-idx
    - parquet-offset-idx
    memcached:
      addresses: "dns+memcached:11211"

# Global override configuration.
overrides:
  per_tenant_override_config: /etc/tempo/configs/overrides.yaml
  defaults:
    metrics_generator:
      processors:
      - service-graphs
      - span-metrics
      - local-blocks

The v2.7.1 container starts without any problems and then restarts, without any visible error message, after about 15 seconds.
Any idea what is wrong with this configuration?

Thanks,
StefanS

@joe-elliott
Member

Is it panicking? exiting? OOMing? Any logs?

@StefanSa
Author

Hi Joe,
Unfortunately no, or the logs disappear so quickly that I don't notice them.
I watch the container live with Portainer and keep refreshing; at some point it simply restarts.

@StefanSa
Author

I use this entrypoint script to load the secrets into environment variables.

#!/bin/sh
# Read the MinIO credentials from Docker secrets into the environment variables
# that the Tempo config references.
export TEMPO_S3_ACCESS_KEY=$(cat /run/secrets/minio_user)
export TEMPO_S3_SECRET_KEY=$(cat /run/secrets/minio_secret)
exec /tempo "$@"
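
For reference, a minimal docker-compose sketch of how this entrypoint and the secrets could be wired together; the service name, image tag, the /entrypoint.sh path, the secret file locations, and the -config.expand-env flag are assumptions, not details confirmed in this thread:

services:
  tempo:
    image: grafana/tempo:2.7.1
    entrypoint: ["/entrypoint.sh"]
    # -config.expand-env lets Tempo expand ${TEMPO_S3_ACCESS_KEY:-...} style
    # placeholders in the config file (assumed to be enabled here).
    command: ["-config.file=/etc/tempo/tempo.yaml", "-config.expand-env=true"]
    secrets:
      - minio_user
      - minio_secret

secrets:
  minio_user:
    file: ./secrets/minio_user
  minio_secret:
    file: ./secrets/minio_secret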

@StefanSa
Author

StefanSa commented Feb 21, 2025

@joe-elliott
I think I have found it.
v2.7.1 has problems with memcached; here is the error message:

level=error ts=2025-02-21T16:28:26.901106344Z caller=memcached.go:143 msg="failed to put to memcached" name=bloom|parquet-footer|parquet-page|frontend-search|parquet-column-idx|parquet-offset-idx err="memcache: no servers configured or available"

v2.6.1, by contrast, logs no error:

level=info ts=2025-02-21T16:34:09.357093011Z caller=cache.go:46 msg="configuring memcached client" roles=bloom|parquet-footer|parquet-page|frontend-search|parquet-column-idx|parquet-offset-idx

level=warn ts=2025-02-21T16:34:09.357208695Z caller=experimental.go:19 msg="experimental feature in use" feature="DNS-based memcached service discovery"

Has something changed in the configuration, or is this a bug with memcached?

@joe-elliott
Member

"failed to put to memcached" errors won't cause tempo to exit. it's just there so you know memcached is misconfigured. we recently improved docs on that here:

#4695
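
As an aside, the cache block in the config above points at dns+memcached:11211, which only resolves if a service literally named memcached is reachable on the same network. A minimal sketch of such a service (the image tag and memory size are assumptions):

services:
  memcached:
    image: memcached:1.6
    # 64 MB of cache memory; tune for your workload.
    command: ["-m", "64"]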

@StefanSa
Author

@joe-elliott
Here is the main mistake:
I am running a health check that has worked with v2.6.1 so far.

- wget --quiet --tries=1 --output-document=- http://localhost:3200/ready | grep -q -w ready || exit 1

With v2.7.1 I now get 503 Service Unavailable back, which was the reason the container kept restarting.
Does /ready no longer exist in v2.7.1, or how else can I do a health check?
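
The /ready endpoint still exists in recent Tempo versions; it returns 503 until all modules have started (and for as long as the crash discussed in this issue keeps the process from ever getting there). One way to keep the same probe but stop the orchestrator from killing the container during normal startup is a start_period on the compose healthcheck; the exact timings below are assumptions:

services:
  tempo:
    healthcheck:
      test: ["CMD-SHELL", "wget --quiet --tries=1 --output-document=- http://localhost:3200/ready | grep -q -w ready || exit 1"]
      interval: 15s
      timeout: 5s
      retries: 5
      # Failures during the first minute are not counted, giving Tempo time to become ready.
      start_period: 60s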

@Hu1buerger

Hu1buerger commented Feb 24, 2025

TL;DR: try removing metrics_generator.traces_storage from your config and start the process again; a sketch of the change follows below.

@StefanSa take a look at #4742.

Is it still panicking?
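
Concretely, that means dropping (or commenting out) the traces_storage block at the end of the metrics_generator section posted above, leaving everything else in that section untouched; a minimal sketch:

metrics_generator:
  # processor: stays exactly as in the original config above
  storage:
    path: /tmp/tempo/generator/wal
    # remote_write settings unchanged from the original config
  # traces_storage removed while testing whether it triggers the v2.7.1 crash:
  # traces_storage:
  #   path: /tmp/tempo/generator/traces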
