Add source labels to inbound metrics #4101

siggy · 2020-02-26T02:32:34Z

Feature Request

What problem are you trying to solve?

I would like to answer the question:
For a given proxy/pod/workload, what are the inbound metrics, grouped by client?

Additionally, per the Prometheus docs, a single metric should always be exported with the same set of labels. We currently do not satisfy this requirement (e.g. dst_deployment is only present on outbound metrics).
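One way to see the inconsistency is a query along these lines. It is only a sketch, and it relies on the standard PromQL behavior that an empty-string matcher (`dst_deployment=""`) selects series where the label is absent. If the description above is accurate, the result should contain only `direction="inbound"` series.

```promql
# Count response_total series that carry no dst_deployment label,
# broken down by direction. An empty-string matcher selects series
# where the label is missing.
count(
  response_total{dst_deployment=""}
) by (direction)
```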

How should the problem be solved?

For inbound metrics, introduce new labels:

src_deployment (new data)
src_namespace (new data)
dst_deployment (copy of deployment)
dst_namespace (copy of namespace)

For outbound metrics, introduce new labels:

src_deployment (copy of deployment)
src_namespace (copy of namespace)

Notes:

Enables grouping by client, when looking at a single proxy.
Makes the set of labels consistent for a given metric.
Backwards compatible with existing Linkerd tooling.
Applies to request_total and response_total metrics.
Applies to other workload types as well (statefulset, daemonset, etc.).

Current example

response_total{
  direction="inbound",
  namespace="linkerd",
  deployment="linkerd-controller",
}
response_total{
  direction="outbound",
  namespace="linkerd",
  deployment="linkerd-web",
  dst_namespace="linkerd",
  dst_deployment="linkerd-controller",
}

Proposed example

response_total{
  direction="inbound",
  src_namespace="linkerd",
  src_deployment="linkerd-web",
  dst_namespace="linkerd",
  dst_deployment="linkerd-controller",
}
response_total{
  direction="outbound",
  src_namespace="linkerd",
  src_deployment="linkerd-web",
  dst_namespace="linkerd",
  dst_deployment="linkerd-controller",
}
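With labels like these in place, the motivating question (inbound metrics for a workload, grouped by client) could be expressed directly against inbound series. The query below is a sketch that assumes the proposed `src_deployment` and `src_namespace` labels exist on inbound metrics exactly as described in this issue; they are not exported today.

```promql
# Inbound responses received by linkerd-controller, grouped by the
# client workload that sent the request.
# Assumes the src_* labels proposed in this issue.
sum(
  response_total{
    direction="inbound",
    dst_namespace="linkerd",
    dst_deployment="linkerd-controller"
  }
) by (src_deployment, src_namespace)
```

Because the selector only touches inbound series, the same grouping could in principle be computed from the metrics of a single proxy, without aggregating across clients in Prometheus.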

Any alternatives you've considered?

Today, we can query outbound metrics with the dst_ labels set to the workload whose inbound traffic we want to measure:

sum(
  response_total{
    direction="outbound",
    dst_namespace="linkerd",
    dst_deployment="linkerd-controller",
  }
) by (deployment, namespace)

Drawbacks to this approach:

This query must be done against Prometheus. This data is not attainable from a single Linkerd proxy.
It's difficult to reconcile these metrics with the inbound metrics of the proxy or workload we are interested in.

How would users interact with this feature?

Existing Linkerd CLIs / UI would not need to change, but this new data would allow simplification of linkerd stat queries in the CLI, dashboard, and Grafana. Users would also have better visibility into an unhealthy pod/workload's client behavior.

Relates to #4102.

/cc @adleong @grampelberg

Pothulapati · 2020-02-26T15:50:57Z

Ah! This is important.
On:

This query must be done against Prometheus. This data is not attainable from a single Linkerd proxy.

(This may not be important, so take it with a pinch of salt :p)
I've raised a similar issue about the proxy not knowing its own metadata, i.e. #4008. If we solve that, I guess we won't have to rely on Prometheus and can get the same data from each individual proxy too.

siggy added the rfc label Feb 26, 2020

siggy mentioned this issue Feb 26, 2020

Make the set of labels consistent within each metric exported by Linkerd #4102

Open

siggy mentioned this issue Apr 15, 2020

Add source metadata RFC linkerd/rfc#15

Open
