Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SingleRequestOrResponseApiTest failure #2399

Closed
Scottmitch opened this issue Oct 16, 2022 · 16 comments · Fixed by #3064
Closed

SingleRequestOrResponseApiTest failure #2399

Scottmitch opened this issue Oct 16, 2022 · 16 comments · Fixed by #3064
Labels
flaky tests Unit tests are flaky

Comments

@Scottmitch
Copy link
Member

https://github.com/apple/servicetalk/actions/runs/3260145231/jobs/5353531124

SingleRequestOrResponseApiTest.streamingService=false, streamingClient=false
java.lang.AssertionError: 
Expected: is <INVALID_ARGUMENT>
     but: was <UNKNOWN>

java.lang.AssertionError: 
Expected: is <INVALID_ARGUMENT>
     but: was <UNKNOWN>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)

@idelpivnitskiy
Copy link
Member

https://github.com/apple/servicetalk/actions/runs/3473606914/jobs/5807562170

SingleRequestOrResponseApiTest.streamingService=false, streamingClient=false
java.lang.AssertionError: 
Expected: is <INVALID_ARGUMENT>
     but: was <CANCELLED>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

@daschl
Copy link
Contributor

daschl commented Nov 30, 2022

https://github.com/apple/servicetalk/pull/2448/checks?check_run_id=9804682299

SingleRequestOrResponseApiTest.streamingService=false, streamingClient=false
java.lang.AssertionError: 
Expected: is <INVALID_ARGUMENT>
     but: was <CANCELLED>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
	at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)

@idelpivnitskiy
Copy link
Member

https://github.com/apple/servicetalk/pull/2487/checks?check_run_id=10612328557

SingleRequestOrResponseApiTest.streamingService=false, streamingClient=false
java.lang.AssertionError: 
Expected: is <INVALID_ARGUMENT>
     but: was <CANCELLED>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)

@Scottmitch
Copy link
Member Author

Scottmitch commented Mar 7, 2023

https://github.com/apple/servicetalk/actions/runs/4358111113/jobs/7618265865

SingleRequestOrResponseApiTest > serverResponseStreamingRouteFailsOnSecondRequestItem(boolean, boolean) > io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(boolean, boolean)[1] FAILED
    java.lang.AssertionError: 
    Expected: is <INVALID_ARGUMENT>
         but: was <CANCELLED>
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
        at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
        at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)

@idelpivnitskiy
Copy link
Member

https://github.com/apple/servicetalk/pull/2567/checks?check_run_id=13056028248

SingleRequestOrResponseApiTest.streamingService=false, streamingClient=false
java.lang.AssertionError: 
Expected: is <INVALID_ARGUMENT>
     but: was <CANCELLED>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)

@idelpivnitskiy
Copy link
Member

https://github.com/apple/servicetalk/pull/2691/checks?check_run_id=16739938717

java.lang.AssertionError: 
Expected: is <INVALID_ARGUMENT>
     but: was <CANCELLED>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)

@Scottmitch
Copy link
Member Author

https://github.com/apple/servicetalk/runs/17377699234

SingleRequestOrResponseApiTest.streamingService=false, streamingClient=false

java.lang.AssertionError: 
Expected: is <INVALID_ARGUMENT>
     but: was <CANCELLED>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)

@idelpivnitskiy idelpivnitskiy added the flaky tests Unit tests are flaky label Oct 25, 2023
@idelpivnitskiy
Copy link
Member

https://github.com/apple/servicetalk/pull/2847/checks?check_run_id=21796187561

SingleRequestOrResponseApiTest.streamingService=false, streamingClient=false
java.lang.AssertionError: 
Expected: is <INVALID_ARGUMENT>
     but: was <CANCELLED>
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
	at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)

@Scottmitch
Copy link
Member Author

https://github.com/apple/servicetalk/actions/runs/10933717951/job/30353131351?pr=3058#step:8:1388

> Task :servicetalk-grpc-netty:test

SingleRequestOrResponseApiTest > serverResponseStreamingRouteFailsOnSecondRequestItem(boolean, boolean) > streamingService=false, streamingClient=false FAILED
    java.lang.AssertionError: 
    Expected: is <INVALID_ARGUMENT>
         but: was <CANCELLED>
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
        at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
        at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)

@Scottmitch
Copy link
Member Author

https://github.com/apple/servicetalk/actions/runs/10959243976/job/30439969588?pr=3061#step:8:873

> Task :servicetalk-grpc-netty:test

SingleRequestOrResponseApiTest > serverResponseStreamingRouteFailsOnSecondRequestItem(boolean, boolean) > streamingService=false, streamingClient=false FAILED
    java.lang.AssertionError: 
    Expected: is <INVALID_ARGUMENT>
         but: was <CANCELLED>
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
        at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsWithInvalidArgument(SingleRequestOrResponseApiTest.java:140)
        at io.servicetalk.grpc.netty.SingleRequestOrResponseApiTest.serverResponseStreamingRouteFailsOnSecondRequestItem(SingleRequestOrResponseApiTest.java:130)

@bryce-anderson
Copy link
Contributor

Receiving CANCEL instead of the INVALID_ARGUMENT looks like a race between turning an exception into a response and closing the stream. The place where the exception appears to be generated is in this block here. We see there that we're using the try with resources pattern to hold the iterator and throwing an exception if there are too many messages. Closing the iterator may be leading to the channel being closed (which will surface as a RST_STREAM(CANCEL) to the client) before the exception has been properly turned into a response and rendered. If we switch the try-with-resources to a try/catch and delay closing the iterator the flakiness appears to alleviate, supporting this hypothesis.

@@ -781,7 +783,9 @@ final class GrpcRouter {
 
                         private Req firstItem(final BlockingIterable<Req> request) throws Exception {
                             final Req firstItem;
-                            try (BlockingIterator<Req> requestIterator = request.iterator()) {
+                            // TODO: since we're closing the request iterator after we exit this block, is that triggering the cancel?
+                            BlockingIterator<Req> requestIterator = request.iterator();
+                            try {
                                 if (!requestIterator.hasNext()) {
                                     throw new GrpcStatusException(new GrpcStatus(INVALID_ARGUMENT,
                                             SINGLE_MESSAGE_EXPECTED_NONE_RECEIVED_MSG));
@@ -794,6 +798,16 @@ final class GrpcRouter {
                                     throw new GrpcStatusException(new GrpcStatus(INVALID_ARGUMENT,
                                             MORE_THAN_ONE_MESSAGE_RECEIVED_MSG));
                                 }
+                            } catch (Throwable t) {
+                                // delay closing to give the exception time to turn into a response.
+                                Executors.global().schedule(() -> {
+                                    try {
+                                        requestIterator.close();
+                                    } catch (Exception e) {
+                                        // swallow.
+                                    }
+                                }, 100, TimeUnit.MILLISECONDS);
+                                throw t;
                             }
                             return firstItem;

@idelpivnitskiy
Copy link
Member

@bryce-anderson thanks for digging into that!
It's still surprising that another exception gets priority over INVALID_ARGUMENT thrown from the try block. As per java spec (https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html), whatever is thrown by "try-with-resources statement" ends up being a suppressed exception of the original one thrown from the try block. Did you understand where CANCEL comes from in this case?

@bryce-anderson
Copy link
Contributor

bryce-anderson commented Sep 24, 2024

@idelpivnitskiy, it looks like the cancel is coming from the BlockingIterator.close() call, implicit from the try-with-resources block, it's not being thrown. I suspect there is more to it than just this: the test is only flaky ~1/2k runs for me locally, and I'd expect a lot more failures if it were simply the iterator.close() call racing.

@idelpivnitskiy
Copy link
Member

I'm agreed that close() is likely the source of cancellation, just don't see how it's possible for a control flow to race with explicitly thrown exception from the inside try block 🤔

@bryce-anderson
Copy link
Contributor

bryce-anderson commented Sep 25, 2024

@idelpivnitskiy and I walked through this a bit more. The code I pointed out above is in the server side, in case that isn't clear. What always happens first is that the requestIterator.close() call is triggered, but most times the inbound publisher has completed so the resulting publisher.cancel() rarely flows down to this NettyChannelPublisher block here. However, in the flaky cases it does get there, perhaps because the final DATA(size=0, fin=true) packet hasn't been processed so we haven't yet short circuited the cancel() method, and we get a hard close of the channel resulting in the CANCEL.

Now how to fix it is challenging. Technically we always call close before rendering the INVALID_ARGUMENT response, it's just that most often that ends up being a no-op as described above. Even if we didn't, using the synchronous API's to write the response and then close would be brittle since we don't get guarantees about flushing the response and the channel close will always be eager.

Possible solutions

  • One solution is to pitch this all back up to the async layer but that will involve filters etc.
  • Another solution is to simply accept the CANCEL result in the test: this test is trying to guard against invalid client implementations since the gRPC API generation should prevent users from sending multiple requests. What we really want from the server implementation is to ensure is that the server doesn't attempt to process such a flawed the request flow, which we do, it's just a matter of how it surfaces to the client. In most cases we do the optimal thing and return an ILLEGAL_ARGUMENT status while rarely we get a CANCEL exception, which while less helpful for debugging, is also reasonable result imo.

Side note

This also all connects back to the question of whether we should close the whole stream on cancelling the inbound publisher: I don't think we should but that opens the door to stream leaks for people who have failed to properly close the outbound publisher (or it's writer representation for the synchronous API's).

@idelpivnitskiy
Copy link
Member

💯 I'm agreed with relaxing the test assertions.
Thanks a lot for summarizing all details!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky tests Unit tests are flaky
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants