Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inbound: Fix gRPC response classification #2496

Merged
merged 5 commits into from
Nov 2, 2023
Merged

inbound: Fix gRPC response classification #2496

merged 5 commits into from
Nov 2, 2023

Conversation

olix0r
Copy link
Member

@olix0r olix0r commented Nov 1, 2023

ecaaf39 changed the proxy's behavior with regard to creating default response classifiers: the defaults used to support detecting gRPC response (regardless of the request properties).

To fix this, we modify the metrics module that uses responses classifiers to require them without inferring defaults. This enforces the intended usage pattern so that we do not silently and implicitly fall back to the default behavior. This requirement is only strictly enforced in development builds. In release builds, the prior defaulting behavior remains in place. This minimizes the difference from the existing behavior, especially if we hit untested cases where a hard failure would be untenable (especially in metrics code).

This change also updates the NewClassify module that inserts the response classifier request extension so that overrides are supported. We then can install a default classifier early in request processing and override it only if specified by a route configuration.

To support this change, the http-metrics crate is updated to support querying response_total metrics without stringifying everything.

ecaaf39 changed the proxy's behavior with regard to creating [default
response classifiers][default]: the defaults used to support detecting
gRPC response (regardless of the request properties).

To fix this, we modify the metrics module that uses responses
classifiers to *require* them without inferring defaults. This enforces
the intended usage pattern so that we do not silently and implicitly
fall back to the default behavior.

This change also updates the `NewClassify` module that inserts the
response classifier request extension so that overrides are supported.
We then can install a default classifier early in request processing and
override it only if specified by a route configuration.

To support this change, the http-metrics crate is updated to support
querying response_total metrics without stringifying everything.

[default]: ecaaf39#diff-372e8a8a57b1fad5d94f37d2f77fdc7a45bcf708782475424b75d671f99ea1a0L97-L103
@olix0r olix0r requested a review from a team as a code owner November 1, 2023 00:30
Copy link
Contributor

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change looks good to me! good call on including debug assertions to help catch regressions in the tests but not take down the whole proxy in prod!

linkerd/app/inbound/src/http/tests.rs Outdated Show resolved Hide resolved
linkerd/http-metrics/src/lib.rs Show resolved Hide resolved
Comment on lines +70 to +71
if req.extensions_mut().insert(classify_rsp).is_some() {
tracing::debug!("Overrode response classifier");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, take it or leave it: maybe worth having this print what the previous classifier was?

Suggested change
if req.extensions_mut().insert(classify_rsp).is_some() {
tracing::debug!("Overrode response classifier");
if let Some(prev) = req.extensions_mut().insert(classify_rsp) {
tracing::debug!(?prev, "Overrode response classifier");

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that requires adding Debug bounds which felt a little larger than desired.

linkerd/http-metrics/src/requests/service.rs Outdated Show resolved Hide resolved
@olix0r olix0r merged commit cbf226e into main Nov 2, 2023
15 checks passed
@olix0r olix0r deleted the ver/class branch November 2, 2023 00:41
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Nov 2, 2023
This release includes several bugfixes. Notably, inbound proxies would
not properly reflect grpc-status in metrics by default.

Furthermore, proxies now long warnings when they receive unexpected
error responses from the control plane.

---

* chore: change `rust-toolchain` file to toml format (linkerd/linkerd2-proxy#2487)
* gate: Detect disconnected inner services in readiness (linkerd/linkerd2-proxy#2491)
* Bump ahash to v0.8.5 (linkerd/linkerd2-proxy#2498)
* gate: Fix readiness deadlock (linkerd/linkerd2-proxy#2493)
* Log a warning when the controller clients receive an error (linkerd/linkerd2-proxy#2499)
* inbound: Fix gRPC response classification (linkerd/linkerd2-proxy#2496)

Signed-off-by: Oliver Gould <[email protected]>
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Nov 2, 2023
This release includes several bugfixes. Notably, inbound proxies would
not properly reflect grpc-status in metrics by default.

Furthermore, proxies now long warnings when they receive unexpected
error responses from the control plane.

---

* chore: change `rust-toolchain` file to toml format (linkerd/linkerd2-proxy#2487)
* gate: Detect disconnected inner services in readiness (linkerd/linkerd2-proxy#2491)
* Bump ahash to v0.8.5 (linkerd/linkerd2-proxy#2498)
* gate: Fix readiness deadlock (linkerd/linkerd2-proxy#2493)
* Log a warning when the controller clients receive an error (linkerd/linkerd2-proxy#2499)
* inbound: Fix gRPC response classification (linkerd/linkerd2-proxy#2496)

Signed-off-by: Oliver Gould <[email protected]>
adleong pushed a commit that referenced this pull request Nov 16, 2023
ecaaf39 changed the proxy's behavior with regard to creating [default
response classifiers][default]: the defaults used to support detecting
gRPC response (regardless of the request properties).

To fix this, we modify the metrics module that uses responses
classifiers to *require* them without inferring defaults. This enforces
the intended usage pattern so that we do not silently and implicitly
fall back to the default behavior.

This change also updates the `NewClassify` module that inserts the
response classifier request extension so that overrides are supported.
We then can install a default classifier early in request processing and
override it only if specified by a route configuration.

To support this change, the http-metrics crate is updated to support
querying response_total metrics without stringifying everything.

[default]: ecaaf39#diff-372e8a8a57b1fad5d94f37d2f77fdc7a45bcf708782475424b75d671f99ea1a0L97-L103
@adleong adleong mentioned this pull request Nov 16, 2023
adleong added a commit that referenced this pull request Nov 16, 2023
* Include server address in server error logs (#2500)
* inbound: Fix gRPC response classification (#2496)
* Bump ahash to v0.8.5 (#2498)
* Allow BSD-2-Clause

---------

Co-authored-by: Oliver Gould <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants