Add source metadata RFC #15

design/0003-source-metadata.md (200 additions, 0 deletions)

- Contribution Name: Source Metadata for Traffic Metrics
- Implementation Owner: Alex Leong
- Start Date: 2020-04-14
- Target Date:
- RFC PR:
- Linkerd Issue:
- Reviewers:

# Summary

[summary]: #summary

When a meshed pod sends outgoing HTTP requests, the Linkerd proxy records metrics
such as latency and request counters and scopes those metrics by the
destination. This is done by setting a `dst_X` label on the Prometheus metrics
where `X` is the destination workload kind. On the other hand, when a meshed
pod receives incoming HTTP requests, there is no equivalent scoping of the
metrics by source. In other words, there is no `src_X` Prometheus label, making
it impossible to break down metrics for incoming traffic by source. This RFC
proposes adding `src_X` Prometheus labels to all meshed HTTP traffic.
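
For example, an outbound request counter today carries destination labels roughly
like this (an illustrative sample; the full label set is larger):

```
request_total{
direction="outbound",
dst_deployment="emoji",
dst_namespace="emojivoto",
...
}
```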

# Problem Statement (Step 1)

[problem-statement]: #problem-statement

Linkerd is not able to add `src_X` labels today because it simply has no
knowledge of the source resource. It knows the peer socket address of the
source, but has no mechanism to convert that address into a Kubernetes resource
name or type. This is in contrast to the `dst_X` metadata which the proxy
gets from the Destination controller when doing service discovery look-ups.

> **Member:** So, I understand this limitation. But can we be clearer about the problem here? What specific features / use cases are we not able to satisfy without these metrics?
>
> Especially because this proposal only includes this metadata on meshed connections (so we're guaranteed to have the client metrics).
>
> What do these labels enable, concretely, that we can't do otherwise? Or if we do not add them, what risks do we incur?

> **Member Author:** The next paragraph talks about the limitations we face from not having these metrics.

This asymmetry in metadata can be very limiting when doing queries. It is
impossible to determine who the clients of a resource are by looking at that
resource's metrics alone. Instead, we need to query the outbound metrics of all
other resources to find a client with the appropriate `dst_X` label. Not only
does this make the query awkward, it also means that resource-to-resource
metrics can only be observed on the client side, never on the server side. This
limits our ability to measure network latency.

> **Member (@olix0r, Apr 22, 2020):** So? Is the problem "this is awkward?" or are there fundamental limitations that we incur here?
>
> I want to be careful about looking at this benefit versus the cost of the implementation...

> **Member Author:** I've added some explicit examples of questions that we can't answer without source metadata.


More specifically, here are some examples of questions that cannot be answered
without introducing source metadata:

* We cannot compare client-side and server-side metrics for the same traffic to
identify latency or errors introduced between the two Linkerd proxies (e.g. by
the network or by other intermediary proxies)

> **Member:** To put a finer point on this... there's a specific API call for which we return incorrect data, right? What is that API call?
>
> I think this may be the only problem we're trying to address with this proposal. I'd try to take out much of the other context about labels and asymmetry and focus very concretely on this issue.

> **Member Author:** I don't think that's quite right. I don't think any of our API calls return incorrect data; their behavior is just confusing and non-uniform.

> **Member:** This makes it sound like it's our metrics API that is the problem: that we have no way to distinguish these concepts at all.
>
> To put it another way: if we made the changes proposed here, what impact would that have on our metrics API? Would we just subtly change the semantics of an existing call? Would we be unable to represent this call?
>
> I view Linkerd's Prometheus data as diagnostics... like log lines, basically. The format & details may change between releases as long as we satisfy our metrics API, which powers consumers within Linkerd. So, a problem statement phrased in terms of our Prometheus data isn't all that compelling.
>
> However, it absolutely does sound like a problem if there are fundamental questions that our API cannot answer about mesh traffic. How do we improve Linkerd's metrics API so that it can properly differentiate client-side and server-side metrics? If this requires source labels, great, I'm all in.

> **Member Author:** Reworked the RFC to be API focused.

> **Member:** Love the updated framing.
>
> If it's alright with you, I'm going to suggest some further edits via PR to contextualize how I see this all fitting together.

* It is difficult to present traffic metrics in a consistent way: top-line
  resource metrics are measured on the server side by default, but in order to
  view a breakdown of these metrics by source, the absence of source metadata
  means that we have to switch to displaying client-side metrics. This is
  confusing at best and misleading at worst.

> **Member:** This is still a bit squishy... It's difficult, but we do it. I.e. for consumers of our API, there's no overhead. Right?

> **Member Author:** We do not present traffic metrics in a consistent way today. We're forced to either hide the distinction between server-side and client-side metrics, which is deceitful, or be explicit that certain queries are measured on one side while other queries are measured on the other, which is confusing. Today we split the difference by being vague. So I see this as a problem for consumers of the API.

* For traffic from unmeshed sources, the problem is even worse. In this case
we don't have client-side metrics at all and can't display any metrics for
this traffic. Introducing source metadata would allow us to distinguish
between meshed and unmeshed traffic on the server side. While we would not
be able to distinguish between different unmeshed sources, we would at least
be able to show metrics for traffic from all unmeshed sources aggregated
together.

> **Member:** I don't think that this RFC actually addresses this issue... We already have client IDs on server metrics.

> **Member Author:** True, though we don't use that data for this purpose today. We could use it, but it would be more straightforward to use source metadata. While this would not impact consumers of the API, I still think that having a cleaner implementation is of some value.
>
> However, I have removed this point for now.


Adding source metadata to HTTP traffic metrics would enable improvements in the
Linkerd Grafana dashboard, 3rd party tools that consume Linkerd's Prometheus
metrics, the controller's StatSummary API, and consequently the `linkerd stat`
CLI command and Linkerd dashboard. These improvements are out of the scope of
this proposal.

# Design proposal (Step 2)

[design-proposal]: #design-proposal

We will use the [Downward
API](https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/#the-downward-api)
to create and mount a volume in meshed pods which contains a file with the pod
owner's resource name and kind. This is very similar to [the approach used to
populate pod labels on span
annotations](https://github.com/linkerd/linkerd2/pull/4199). These values can
be extracted from pod labels such as `linkerd.io/proxy-deployment`,
`linkerd.io/proxy-daemonset`, etc. We can create a simple startup script in the
proxy container to preprocess these pod labels into the labels that we would
like the proxy to use. For example, the Downward API would create a volume with
a file containing:

```
app="web-svc"
linkerd.io/control-plane-ns="linkerd"
linkerd.io/proxy-deployment="web"
pod-template-hash="5cb99f85d8"
```

Our startup script would rewrite this to keep only the labels that we want and
to put them in the desired format:

```
resource_name="web"
resource_kind="deployment"
```
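
As a rough sketch of that preprocessing step, the following Go program shows the
intended transformation. The RFC does not prescribe a language (a few lines of
shell in the proxy container would work equally well), and the mount path and
the set of recognized `linkerd.io/proxy-<kind>` labels below are assumptions for
the example:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// Workload kinds that the proxy-injector records as linkerd.io/proxy-<kind>
// pod labels. The exact set is an assumption for this sketch.
var kinds = []string{"deployment", "daemonset", "statefulset", "replicationcontroller", "job"}

func main() {
	// Path where the Downward API labels file is mounted; the actual mount
	// path is an assumption for this sketch.
	in, err := os.Open("/var/run/linkerd/podinfo/labels")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer in.Close()

	scanner := bufio.NewScanner(in)
	for scanner.Scan() {
		key, value, ok := strings.Cut(strings.TrimSpace(scanner.Text()), "=")
		if !ok {
			continue // skip blank or malformed lines
		}
		for _, kind := range kinds {
			if key == "linkerd.io/proxy-"+kind {
				// e.g. linkerd.io/proxy-deployment="web" becomes:
				//   resource_kind="deployment"
				//   resource_name="web"
				fmt.Printf("resource_kind=%q\n", kind)
				fmt.Printf("resource_name=%s\n", value) // value is already quoted in the source file
			}
		}
	}
	// All other labels are dropped.
}
```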

The proxy would read this file to get a list of labels to apply to its traffic
metrics. The way the proxy uses these labels differs for the inbound and
outbound stacks.

For the inbound stack, the proxy will prepend the `dst_` prefix to these labels
and use them to scope the inbound metrics. This will result in metrics such as:

```
request_total{
direction="inbound",
dst_resource_name="web",
dst_resource_kind="deployment",
...
}
```

The `dst_` prefix is used like this because on the inbound side, the destination
is the local resource. Source labels are also added to the inbound metrics but
we'll come back to that in a moment.

For the outbound stack, the proxy will prepend the `src_` prefix to the labels
and use them to scope the outbound metrics. The corresponding `dst_` labels
will be populated by the dst metadata from the destination controller. This
will result in metrics such as:

```
request_total{
direction="outbound",
src_resource_name="web",
src_resource_kind="deployment",
dst_resource_name="emoji",
dst_resource_kind="deployment",
...
}
```

The outbound stack will also share the source labels with the inbound stack of
the destination proxy. Remember when we said we'd come back to inbound?

In addition to populating the `dst_` labels, the inbound stack will also use the
source labels, prepend `src_` to them, and add them to the label scope.
Thus, the complete inbound metrics would actually look like:


```
request_total{
direction="inbound",
dst_resource_name="web",
dst_resource_kind="deployment",
src_resource_name="vote-bot",
src_resource_kind="deployment",
}
```
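
To make the composition of these label sets concrete, here is a small sketch
(with hypothetical helper names, and written in Go purely for illustration; the
proxy itself is written in Rust) of how an inbound proxy could merge its own
metadata-file labels under a `dst_` prefix with the peer-supplied labels under a
`src_` prefix:

```go
package main

import "fmt"

// prefixLabels returns a copy of labels with prefix prepended to every key.
func prefixLabels(prefix string, labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		out[prefix+k] = v
	}
	return out
}

func main() {
	// Labels read from this pod's own metadata file (the local workload).
	local := map[string]string{"resource_name": "web", "resource_kind": "deployment"}
	// Labels shared by the peer's outbound proxy (the sharing mechanism is
	// out of scope for this RFC).
	peer := map[string]string{"resource_name": "vote-bot", "resource_kind": "deployment"}

	// On the inbound side the local workload is the destination and the peer
	// is the source, so the two sets are merged under dst_ and src_ prefixes.
	inbound := map[string]string{"direction": "inbound"}
	for k, v := range prefixLabels("dst_", local) {
		inbound[k] = v
	}
	for k, v := range prefixLabels("src_", peer) {
		inbound[k] = v
	}
	fmt.Println(inbound)
	// map[direction:inbound dst_resource_kind:deployment dst_resource_name:web
	//     src_resource_kind:deployment src_resource_name:vote-bot]
}
```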

The details of how the source labels will be shared between the source outbound
proxy and the destination inbound proxy are out-of-scope of this RFC.

Note that all of the changes described here are additive to the existing
Prometheus labels and would not introduce any backwards incompatibility.

This change does increase the cardinality of Prometheus time-series since
inbound metrics will now be scoped by source resource. This will bring the
cardinality of the inbound metrics roughly in line with that of the outbound
metrics.

Note also that this implementation means that source metadata will NOT be
available when the source resource is not meshed.

There are no known blockers or prerequisites before this work can be started.

# Prior art

[prior-art]: #prior-art

The Istio mixer telemetry architecture showcases a different approach to
populating source metadata. In that architecture, rather than having data plane
proxies expose metrics to Prometheus directly, the metrics are ingested by the
control plane and post-processed. Source IP addresses are resolved to source
resources before being stored.

It would be difficult to incorporate this approach in Linkerd without
introducing significant complexity. The fact that Prometheus is able to scrape
Linkerd proxies directly is a great property. Alternatively, having the Linkerd
proxy resolve IP addresses to resources would either require an API call which
would introduce latency in the critical data path or would require an in-process
cache which would increase the proxy memory footprint.

In Istio configurations which do not use Mixer, source context is communicated
through a base64-encoded map of key-value pairs sent in an HTTP header. This
approach requires reading and decoding this header on a per-request basis, even
though we know that the source metadata will be the same for all requests on a
single connection. By communicating source metadata at the connection level,
we can avoid doing this work for each request.

> **Contributor:** The non-Mixer Istio actually does this differently still. Have you taken a peek at how they're doing it now?

> **Member Author:** I haven't. I'll check it out.

# Unresolved questions

[unresolved-questions]: #unresolved-questions

The details of how to share source metadata will be discussed in another RFC.

# Future possibilities

[future-possibilities]: #future-possibilities

Make use of these new metrics in the Grafana dashboards, StatSummary API, Linkerd
CLI and Linkerd dashboard.