ByteBufferPool - all ongoing requests fail when single request is cancelled on HTTP/2 #12776
Comments
Thanks for filing this issue, as well as the reproducing instructions. I managed to reproduce the issue quite easily with those extra instructions:
I can reproduce the problem on Jetty 11.0.20 as well as on 12.0.16 with the I'm now busy trying to understand what's going on. Stay tuned! |
Thank you very much for looking at the issue. Your observations match ours: it occurs less frequently in Firefox than in Chromium (but it still occurs sometimes). I tried to reproduce it with your extra instructions, but I was not successful when accessing over localhost. When I seek the video rapidly, I only get "cancelled" requests for /video (which is totally fine) and the SSE connection stays open without failure. When I try the same thing over the internet, I get ERROR_HTTP2_PROTOCOL_ERROR. Over the internet, sometimes you don't even have to start seeking in the video; it fails by itself. Just to clarify: it is not only SSE that fails. If other regular requests are being processed at that moment, they fail as well. Another observation is that the issue sometimes also occurs when you open another tab with a direct URL to the video. The same happens when the video is served by DefaultServlet or ResourceHandler (with setAcceptRanges(true)). I tried to keep the reproduction example as small as possible, so these tests are not included, but we tried them with the same results.
@ivosek In your reproducer's HTTP2ServerConnectionFactory h2 = new HTTP2ServerConnectionFactory(httpsConfig); could you please try adding the following extra line right after it: h2.setRateControlFactory(new WindowRateControl.Factory(1024)); then try again to see if that helps? |
@lorban adding the line seems to have no effect on our issue. The tester (with the line added) now runs here: https://13.79.237.171:9443/ When I tried it, it failed on the first click on the seek bar.
I tried your server and I could not get an error using Chromium, but I could using Firefox. I managed to get error Unfortunately even a sniffer trace did not help explaining why Firefox decided to send a GOAWAY frame, nor why it decided that the SSE stream was faulty. The next step of this investigation would be for you to reproduce the issue while collecting debug logs on the server. Since you suspect a buffer corruption issue, everything under Would it be possible for you to configure logback/log4j/your-fave-logger to collect Jetty's debug logs into a file, compress it then post it here? Thanks! |
Of course, no problem. Here is the jetty.log.gz. The log is not so big after all; I was able to reproduce the issue a few seconds after Jetty started. Please let me know if you need some loggers switched to TRACE level or if you need any other assistance. I kept the server running (at log level INFO) in case you would like to do other experiments. The modified source code of the tester is in this repo.
… flusher might still reference it Signed-off-by: Ludovic Orban <[email protected]>
… flusher might still reference it, reverting #11527 Signed-off-by: Ludovic Orban <[email protected]>
Here is a quick update about where we stand with this issue. Regarding Jetty 11, this is a legit bug caused by #11527 as you identified: that change is safe for HTTP 1.x but not for HTTP 2. It is fairly easy to reproduce and since it's a small change I've created a pull request for the 10/11 versions (which may or may not end up being merged) despite the fact that they're EOL. Regarding Jetty 12, the buffer removal upon error mechanism was not removed from 12.0.x, so this is a different problem than Jetty 11. It's also much harder to reproduce but has similar symptoms when it triggers, plus the fact that configuring the non-pooling pool on the server makes the problem go away definitely indicates that there is a buffer corruption bug somewhere. More digging is needed here. |
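To make the hazard described above concrete, here is a minimal toy model in Java. It is a sketch under stated assumptions and not Jetty's code: `BufferReuseSketch`, `ToyPool`, the 16-byte buffer size and the frame contents are all hypothetical names and values invented for illustration. It only models the general pattern where a buffer is handed back to a pool on failure while another component (standing in for the HTTP/2 flusher) still holds a reference to it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: hypothetical names, not Jetty's ByteBufferPool or flusher.
final class BufferReuseSketch {

    // A single-size pool that can also be switched to "non-pooling" mode.
    static final class ToyPool {
        private final Deque<ByteBuffer> free = new ArrayDeque<>();
        private final boolean pooling;

        ToyPool(boolean pooling) {
            this.pooling = pooling;
        }

        ByteBuffer acquire() {
            ByteBuffer buffer = pooling ? free.poll() : null;
            if (buffer == null) {
                buffer = ByteBuffer.allocate(16);
            }
            buffer.clear();
            return buffer;
        }

        void release(ByteBuffer buffer, boolean failed, boolean repoolOnFailure) {
            if (!pooling) {
                return; // non-pooling: the buffer is simply left to the GC
            }
            if (failed && !repoolOnFailure) {
                return; // drop it: someone may still reference this buffer
            }
            free.push(buffer);
        }
    }

    private static byte[] ascii(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    // Returns what the pretend flusher ends up writing for the cancelled stream.
    static String simulate(boolean pooling, boolean repoolOnFailure) {
        ToyPool pool = new ToyPool(pooling);

        // Stream A fills a buffer; the flusher queues it for a deferred write.
        ByteBuffer a = pool.acquire();
        a.put(ascii("VIDEO-FRAME")).flip();
        ByteBuffer queuedByFlusher = a;

        // Stream A is cancelled and its buffer is released as failed.
        pool.release(a, true, repoolOnFailure);

        // Stream B (think: the SSE response) acquires a buffer and fills it.
        ByteBuffer b = pool.acquire();
        b.put(ascii("SSE-EVENT")).flip();

        // The flusher finally reads the buffer it queued earlier.
        return StandardCharsets.US_ASCII.decode(queuedByFlusher.duplicate()).toString();
    }

    public static void main(String[] args) {
        System.out.println("re-pool on failure : " + simulate(true, true));
        System.out.println("drop on failure    : " + simulate(true, false));
        System.out.println("non-pooling        : " + simulate(false, true));
    }
}
```

In this model, re-pooling on failure lets stream B overwrite the buffer the flusher still holds, so the flusher reads B's bytes instead of A's. Dropping the buffer on failure, or not pooling at all, keeps the two streams on separate buffers, which is consistent with the non-pooling workaround making the symptoms disappear.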
@lorban thank you for the update. Should we put more effort into trying to make it reproducible on localhost? Would that help you? I have another observation which supports the buffering hypothesis: when your internet connection is fast, the issue is harder to reproduce (it fails about once in 5 tries); when your connection is around 20 Mbps, it fails every time. Maybe throttling the connection speed in the browser would make it easier to reproduce.
@ivosek I consider Jetty 11 sorted, so I assume you're speaking about Jetty 12 when you explain that low internet bandwidth helps reproduce the issue. There is a throttling setting in Chromium's developer tools (Network tab > No Throttling next to Disable Cache) where you can easily limit the bandwidth. It helped me reproduce the issue on Jetty 11 very easily (once every 2 or 3 tries), but it's still very hard on Jetty 12 (once every 100 tries maybe). If I could reproduce the problem more easily, I could find the root cause a lot faster, because unfortunately the debug logs don't clearly show where the problem is; for Jetty 11 I had to correlate the logs with a code analysis and multiple runs through the debugger. Anything you can do to help me here would be most welcome!
Since you seem to be able to reproduce this issue quite easily on Jetty 12.0.16, here are a few things I'd be glad for you to try (at least some of them):
Thanks! |
@lorban I will do everything you have asked for and supply you with the logs for each case, but I am not sure I will be able to do it this week due to some high-priority tasks. I promise to get back to you as soon as possible. While trying to fix the issue our users reported, before we filed the issue here, we already tried using the Core Handler API instead of servlets, but the bug was still there. Same as if we used I will also put effort into making this reproducible on localhost.
@lorban I managed to run some tests. The issue is still present on Jetty 12.0.16 with EE10 servlets, Jetty 12.0.17-SNAPSHOT with EE9 and Jetty 12.1.0-SNAPSHOT with EE9. jetty-12.0.16-ee10.log.gz The bad news is that on 12.1.0-SNAPSHOT it fails every time; in Chrome you don't even have to click the seek bar, just open the page and wait for 3 seconds. In Firefox, 2-3 clicks on the seek bar and it fails as well. It seems that the behaviour in 12.1.0 is much worse, but I am still not able to reproduce it on localhost, even with network throttling.
with |
Signed-off-by: Ludovic Orban <[email protected]>
@ivosek Just to give you a quick status update on this issue: 12.0 uses the same logic as 11.0 to remove the buffer from the pool on error. Unfortunately, it looks like there is a race condition somewhere that makes it fail more or less easily. Some troubleshooting effort is still needed to figure out what exactly is broken. 12.1 has a much more involved mechanism that allows always re-using the buffer, but its implementation was incomplete. #12727 is now about solving that problem once and for all. We'll come back to you once we have made sufficient progress, so that you can help us test our fixes if you can spare a bit more time.
#12776 do not re-pool the buffer when releasing upon failure and HTTP version >= 2 as the H2 flusher might still reference it Signed-off-by: Ludovic Orban <[email protected]>
Jetty version(s)
12.0.16 (11.0.x version since 11.0.21)
Jetty Environment
Embedded + EE9
Java version/vendor (use: java -version)
OS type/version
Ubuntu 24, Arch Linux, Manjaro...
Description
When a web browser plays video / audio by sending requests with a Range header and a request gets cancelled (for example because the user is seeking in the video),
all other requests currently being processed get cancelled as well (there is an HTTP2_ERROR in the Chrome or Firefox network console).
The issue occurs only when HTTP/2 is used. It works on Jetty 11.0.20, but fails on Jetty 11.0.21, 11.0.22, 11.0.23, 11.0.24 and 12.0.16.
When we disable HTTP/2 or revert this commit on Jetty 11.0.24, it never fails.
This issue also never occurred when tested on localhost or a LAN (probably due to transfer speed).
How to reproduce?
jetty-issue.zip
Run attached code with JDK 21 using:
It starts Jetty 12.0.16 on port 9443 with HTTP/2 connections, downloads a sample video, starts to serve the contents of the web folder, and provides the endpoint /video for serving video data and /sse as a Server-Sent Events source.
When you open https://IP:9443 in a web browser, it establishes a Server-Sent Events connection and starts to play the video.
Try to seek in the video before it gets fully buffered in the web browser. You will see that the SSE connection fails (an alert will be shown).
NOTE: We were never able to reproduce this when using localhost or a LAN address. The only way to reproduce it is to forward the port to a public IP address and access that address over the internet (we tried multiple ISPs to be sure that the issue is not caused by firewalls or network devices).
Our only explanation for why it happens only when accessing from the internet is the effect of transfer speed, and thus different buffer usage.
In a TCP dump you can see that an HTTP/2 GOAWAY was sent.
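For anyone who wants to approximate the browser's seek behaviour without a browser, the following is a rough sketch using the JDK's `java.net.http.HttpClient` (JDK 16 or later for cancellable exchanges). `RangeCancelProbe`, the 1,000,000-byte seek step, the 200 ms delay and the 20 iterations are hypothetical choices, not part of the attached reproducer. It does not open the /sse stream, it assumes the server certificate is trusted by the JVM (a self-signed certificate would need a custom `SSLContext`, not shown), and it has not been confirmed to trigger the issue.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper, not part of the attached reproducer.
final class RangeCancelProbe {

    // Open-ended byte range starting at firstByte, e.g. "bytes=1024-".
    static String rangeHeader(long firstByte) {
        if (firstByte < 0) {
            throw new IllegalArgumentException("firstByte must be >= 0");
        }
        return "bytes=" + firstByte + "-";
    }

    // Builds a GET for the video endpoint that prefers HTTP/2 and asks for a byte range.
    static HttpRequest rangeRequest(URI video, long firstByte) {
        return HttpRequest.newBuilder(video)
                .version(HttpClient.Version.HTTP_2)
                .header("Range", rangeHeader(firstByte))
                .GET()
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        if (args.length < 1) {
            System.out.println("usage: java RangeCancelProbe.java https://HOST:9443/video");
            return;
        }
        URI video = URI.create(args[0]);
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .build();
        long step = 1_000_000L; // assumed seek distance in bytes
        for (int seek = 0; seek < 20; seek++) {
            CompletableFuture<HttpResponse<Void>> inFlight =
                    client.sendAsync(rangeRequest(video, seek * step),
                                     HttpResponse.BodyHandlers.discarding());
            Thread.sleep(200);     // let part of the body arrive
            inFlight.cancel(true); // abandon the exchange, similar to a seek in the player
        }
    }
}
```

To observe the failure mode described here, a second long-lived request on the same connection (for example to /sse) would have to be running while the range requests are cancelled; that part is left out of the sketch.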
Reverting to Jetty 11.0.20 helps. We also tried reverting this commit on Jetty 11.0.24, and that works as well.
On Jetty 12 our only solution to fix this is to set ByteBufferPool.NON_POOLING in the constructor of the Jetty Server instance.