
Serve metrics via client_golang #2

Merged: siggy merged 1 commit into main on Apr 19, 2021
Conversation

@siggy (Member) commented on Apr 16, 2021

Switch to exporting metrics using the Prometheus client_golang library,
to help ensure we conform to Prometheus metrics best practices. This
change also decreases scrape latency and memory usage.

Summary of changes:

  • generate metrics values out-of-band from scrape
  • remove duplicate time series from /metrics
  • add histogram _sum and _count metrics
  • ensure consistent labels per metric
  • deterministic label ordering per timeseries
  • HELP and TYPE metadata support

Signed-off-by: Andrew Seigner [email protected]
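A minimal sketch of the approach described above, using hypothetical metric and label names rather than this repo's actual code: metrics are registered with a client_golang registry, updated by a background loop out-of-band from the scrape, and served via promhttp, which emits the `HELP`/`TYPE` metadata and the histogram `_sum`/`_count` series automatically.

```go
package main

import (
	"math/rand"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Hypothetical metric; not the actual series exported by this repo.
var responseLatency = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Name:    "response_latency_ms",
		Help:    "Response latency in milliseconds.",
		Buckets: []float64{1, 2, 5, 10, 50, 100, 500, 1000},
	},
	// Every time series of this metric carries the same label set,
	// so labels are never optional and ordering is deterministic.
	[]string{"direction", "dst_workload_name", "client_id"},
)

func main() {
	reg := prometheus.NewRegistry()
	reg.MustRegister(responseLatency)

	// Generate metric values out-of-band from the scrape: a background
	// loop updates the collectors, so the /metrics handler only has to
	// encode already-collected state when Prometheus scrapes.
	go func() {
		for range time.Tick(time.Second) {
			responseLatency.
				WithLabelValues("inbound", "emoji", "default.web.serviceaccount").
				Observe(rand.Float64() * 100)
		}
	}()

	// promhttp writes HELP/TYPE metadata and the _sum/_count series
	// for histograms automatically.
	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	http.ListenAndServe(":4191", nil)
}
```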

misc testing

main (local):

```
$ time curl -s -o /dev/null http://localhost:4191/metrics

real	0m3.282s
user	0m0.294s
sys	0m0.203s
```

this branch (local):

```
$ time curl -s -o /dev/null http://localhost:4191/metrics

real	0m1.227s
user	0m0.147s
sys	0m0.189s
```

resource usage in AKS, with 40 agents:

```
$ kubectl --context aks-load-test -n fauxmetheus top pods
NAME                           CPU(cores)   MEMORY(bytes)
fauxmetheus-5d5f4d4568-d4ww5   2060m        2521Mi
```

successful concurrent scrapes in AKS:

```
$ curl -s localhost:4191/metrics | grep promhttp_metric_handler_requests
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 49
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 5578
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
```

@siggy requested review from alpeb and adleong on Apr 16, 2021 at 07:10
@siggy self-assigned this on Apr 16, 2021
@adleong (Member) left a comment:

Very nice!

"dst_workload_name": target,

// inbound
"client_id": "",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be removed? I believe the client_id label is only used for inbound metrics.

@siggy (Member, Author) replied:

The client_golang library will fail with a runtime error if we remove this, as labels should never be optional within a single metric.

We're caught between conforming to Prometheus best practices (and the emerging OpenMetrics spec) and reproducing the proxy's output. Eventually I'd expect the proxy to conform to OpenMetrics; part of that work is documented in linkerd/linkerd2#4102.
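A minimal sketch (hypothetical metric, not the proxy's actual series) of the runtime failure mentioned above: client_golang metric vectors require every child series to supply the full label set declared at construction, so the label has to stay, even if its value is empty for outbound traffic.

```go
package main

import "github.com/prometheus/client_golang/prometheus"

func main() {
	// Hypothetical metric; the real series and labels differ.
	requests := prometheus.NewCounterVec(
		prometheus.CounterOpts{Name: "request_total", Help: "Total requests."},
		[]string{"direction", "client_id"},
	)

	// Fine: an outbound series still carries client_id, just with an empty value.
	requests.WithLabelValues("outbound", "").Inc()

	// Panics at runtime with an "inconsistent label cardinality" error,
	// because only one of the two declared label values is supplied.
	requests.WithLabelValues("outbound").Inc()
}
```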

@siggy merged commit 8ac073d into main on Apr 19, 2021
@siggy deleted the siggy/prom branch on Apr 19, 2021 at 17:54