Feature Request
What problem are you trying to solve?
I would like to answer the question:
For a given proxy/pod/workload, what are the inbound metrics, grouped by client?
Additionally, per the Prometheus docs, a single metric should always be exported with the same set of labels. We currently do not satisfy this requirement (e.g. dst_deployment is only present on outbound metrics).
How should the problem be solved?
For inbound metrics, introduce new labels:
src_deployment (new data)
src_namespace (new data)
dst_deployment (copy of deployment)
dst_namespace (copy of namespace)
For outbound metrics, introduce new labels:
src_deployment (copy of deployment)
src_namespace (copy of namespace)
Notes:
Enables grouping by client, when looking at a single proxy.
Makes the set of labels consistent for a given metric.
Backwards compatible with existing Linkerd tooling.
Applies to request_total and response_total metrics.
Applies to all workload types (statefulset, daemonset, etc.).
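As a sketch of what the proposal would enable: assuming the new src_deployment and src_namespace labels were added to inbound metrics as described above (they do not exist today), the inbound traffic of a workload could be grouped by client using only that workload's own metrics. The linkerd-controller workload is used purely as an illustration.

```promql
# Hypothetical: relies on the proposed src_* and dst_* labels on inbound metrics.
sum(
  response_total{
    direction="inbound",
    dst_namespace="linkerd",
    dst_deployment="linkerd-controller",
  }
) by (src_deployment, src_namespace)
```

Because the grouping labels would live on the inbound series themselves, the same breakdown should also be recoverable from a single proxy's metrics output, without aggregating across every client in Prometheus.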
Any alternatives you've considered?
Today, we can query outbound metrics, with the dst_ labels set to the workload we want to measure inbound traffic for:
sum(
  response_total{
    direction="outbound",
    dst_namespace="linkerd",
    dst_deployment="linkerd-controller",
  }
) by (deployment, namespace)
Drawbacks to this approach:
This query must be done against Prometheus. This data is not attainable from a single Linkerd proxy.
It's difficult to reconcile these metrics with the inbound metrics of the proxy or workload we are interested in.
How would users interact with this feature?
Existing Linkerd CLIs / UI would not need to change, but this new data would allow simplification of linkerd stat queries in the CLI, dashboard, and Grafana. Users would also have better visibility into an unhealthy pod/workload's client behavior.
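For example, a per-client success rate for one workload might look like the sketch below. It assumes the proposed src_deployment and src_namespace labels exist on inbound metrics, and it assumes response_total carries a classification label with the value "success"; the 1m window and the linkerd-controller workload are arbitrary choices for illustration.

```promql
# Hypothetical: src_* labels are the ones proposed in this issue.
# Success rate of inbound responses, broken down by calling workload.
  sum(
    rate(response_total{
      direction="inbound",
      dst_namespace="linkerd",
      dst_deployment="linkerd-controller",
      classification="success",
    }[1m])
  ) by (src_deployment, src_namespace)
/
  sum(
    rate(response_total{
      direction="inbound",
      dst_namespace="linkerd",
      dst_deployment="linkerd-controller",
    }[1m])
  ) by (src_deployment, src_namespace)
```

A client whose ratio is noticeably lower than the others would stand out directly in the unhealthy workload's own inbound metrics.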
Regarding "This query must be done against Prometheus. This data is not attainable from a single Linkerd proxy.":
(This may not be important, so take it with a pinch of salt :p)
I've raised a similar issue about the proxy not knowing this metadata, i.e. #4008. If we solve that, I guess we won't have to rely on Prometheus, and we could get the same data from each individual proxy too.
Relates to #4102.
/cc @adleong @grampelberg