Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect parsing of multipart data results in unfilled form parameters and NPE in logs #291

Open
p-himik opened this issue May 8, 2019 · 15 comments

Comments

@p-himik
Copy link

p-himik commented May 8, 2019

yada 1.3.0-alpha10.
Take a look at this:
image

I don't think I can provide you the data itself, but let me know if I can provide anything else.

@p-himik
Copy link
Author

p-himik commented May 8, 2019

Trying to debug this myself.
I placed a breakpoint in yada.multipart/copy-bytes-after-delimiter and I'm trying to look through the values of window.
Of course maybe I don't understand something and window can be reused partially, but right now it seems that there's broken content - sometimes no content-type after the boundary, sometimes mixed content of XML and PDF files that I'm trying to upload. As if there's some concurrency issue and multiple threads are writing to the same buffer without using any locks.

@malcolmsparks
Copy link
Contributor

Do you have any test data you can share that replicates the issue?

@p-himik
Copy link
Author

p-himik commented May 8, 2019

Couldn't share the original data but managed to reproduce the issue with some completely random data. Note that it's rather sensitive to the sizes.
Form data:

------WebKitFormBoundaryAhflCu7e161nGb1Z
Content-Disposition: form-data; name="edit"

["^ ","~:media",["^ ","~i-43",["^ ","~:name","1111122222aaaaabbbbbccccc.sib","~:kind","~:main-sib","~:score-id",-1,"~:id",-43],"~i-44",["^ ","^2","11111222223333.pdf","^3","~:main-pdf","^5",-1,"^6",-44],"~i-45",["^ ","^2","11111111aaaaabbbbbccccccc.xml","^3","~:main-xml","^5",-1,"^6",-45]],"~:score",["^ ","~:description","","~:sequence-type-display-text","interval","~:default-tempo",100,"~:collection-id",49,"~:difficulty","Intermediate","~:transpose-limit",null,"~:percent-random",0,"~:transpose",false,"^2","aaaaaaaa","~:duration",90,"~:tempo-control-mode","Multiplier","~:play-as-chords",false]]
------WebKitFormBoundaryAhflCu7e161nGb1Z
Content-Disposition: form-data; name="main-sib"; filename="1111122222aaaaabbbbbccccc.sib"
Content-Type: application/octet-stream


------WebKitFormBoundaryAhflCu7e161nGb1Z
Content-Disposition: form-data; name="main-pdf"; filename="11111222223333.pdf"
Content-Type: application/pdf


------WebKitFormBoundaryAhflCu7e161nGb1Z
Content-Disposition: form-data; name="main-xml"; filename="11111111aaaaabbbbbccccccc.xml"
Content-Type: text/xml


------WebKitFormBoundaryAhflCu7e161nGb1Z--

The files themselves: files.zip

@p-himik
Copy link
Author

p-himik commented May 9, 2019

Huh, it's not that rare actually - I just encountered it 3 more times with different data.
Also, I just noticed that there's this in the logged stack traces:

clojure.lang.ExceptionInfo: No end delimiter
at yada.multipart$finish_up.invokeStatic(multipart.clj:305)
at yada.multipart$finish_up.invoke(multipart.clj:229)
at yada.multipart$process_chunk.invokeStatic(multipart.clj:340)
at yada.multipart$process_chunk.invoke(multipart.clj:338)
at yada.multipart$parse_multipart$this__10282__auto____41431$fn__41432$fn__41433.invoke(multipart.clj:426)
at manifold.deferred$fn__10164$chain___10185.invoke(deferred.clj:859)
at manifold.deferred$fn__10164$subscribe__10165$fn__10174.invoke(deferred.clj:815)
at manifold.deferred.Listener.onSuccess(deferred.clj:219)
at manifold.deferred.Deferred$fn__9975$fn__9976.invoke(deferred.clj:398)
at clojure.lang.AFn.run(AFn.java:22)
at io.aleph.dirigiste.Executor$Worker$1.run(Executor.java:62)
at manifold.executor$thread_factory$reify__9488$f__9489.invoke(executor.clj:47)
at clojure.lang.AFn.run(AFn.java:22)
at java.base/java.lang.Thread.run(Thread.java:834)

But the initial error is still there in the stack traces as well, so I don't really think that the "No end delimiter" is really relevant.

@p-himik
Copy link
Author

p-himik commented May 10, 2019

@malcolmsparks I think this line https://github.com/juxt/yada/blob/master/ext/multipart/src/yada/multipart.clj#L187 should use copy-bytes-after-delimiter instead. What do you think?

@leguma
Copy link

leguma commented Jul 2, 2019

Clarification: I'm seeing similar symptoms with version 1.2.15 and 1.2.16.

@p-himik I suspect I'm running into this exact issue as well. It seems to trigger more often with certain arg combinations, and for certain combos it fails 50% of the time locally, 100% of the time on our server. It definitely smells like a concurrency issue of some sort.

Did ya' try out that change that you suggested?

@p-himik
Copy link
Author

p-himik commented Jul 8, 2019

@leguma Yep, so far so good since I've created this issue. I just copied the file into the source directory of my own project and modified it there. The classpath preference makes sure that it's the one that's used instead of the yada's original one.

@leguma
Copy link

leguma commented Jul 8, 2019

@p-himik I verified that fixing that line worked! In case someone wants some reproduction information, here's a bunch of tests I ran attempting to isolate the issue:
Yada Multipart Testing.xlsx

It can be difficult to reproduce with a universal set of args since it seems very platform specific, but you should be able to do so with some tinkering.

I'll submit a PR with the fix, since I don't see one submitted yet for some reason. For anyone coming in before it's in a stable release, modify the line that @p-himik mentioned (pass in m directly instead of destructuring it):
:bytes (copy-bytes-after-delimiter m s e)

I think the issue has something to do with multipart chunking coming in different orders sometimes? Example good and bad slices it's trying to parse:

Failure:
([] [----------------------------035613219007199590421006] [Content-Disposition form-data; name="file2"; filename="file2.pdf"] [Content-Type application/pdf])

Success:
([Content-Disposition form-data; name="file1"; filename="file1.jpg"] [Content-Type image/jpeg])

When it goes to grab info from those parsed headers, it grabs first and second elements, which are obviously wrong in the failure example.

@malcolmsparks
Copy link
Contributor

This is good news. Which platforms are involved? I have been unable to replicate the issue on Linux, but very happy to review/merge a PR

@p-himik
Copy link
Author

p-himik commented Jul 9, 2019

@malcolmsparks I am on Linux as well, I was able to replicate it locally and on Heroku. But as I mentioned on Gitter, I was unable to make a reliable test case.
I don't think it's platform specific - seems to be dependent on the overall scenario of execution. I.e. the test data I have provided worked when I used it with a running application but didn't work in a test. Most likely, one can just tweak payload size while taking CHUNK-SIZE into account (maybe some other buffer size, can't tell for sure) to make it reproducible in a test.

@malcolmsparks
Copy link
Contributor

That's useful info. I'll try to replicate locally in that case. The implementation was subjected to a round of property-based testing but obviously that didn't catch this.

@p-himik
Copy link
Author

p-himik commented Jul 9, 2019

But I think that the fix that I've come up with can and should be implemented even if it proves impossible to create a robust test case. After all, copy-bytes-after-delimiter is definitely the correct method to use there since it is expected to have a boundary delimiter there.

@malcolmsparks
Copy link
Contributor

Yes I agree.

@leguma
Copy link

leguma commented Jul 9, 2019

This is good news. Which platforms are involved? I have been unable to replicate the issue on Linux, but very happy to review/merge a PR

It happens on any platform. I've reproduced it in a Linux docker image, a Windows machine, and on OSX. As ya' can see from the tests I ran, the order and size of the files seem to matter, as well as some fuzziness caused by environment-specific stuff. I have reproduced it with XHR and Postman. The only pattern that I can immediately point to is "large" files followed by "small" ones may somehow be a trigger. Try using 4+ files of differing sizes and juggling them around until you trip the switch.

@leguma
Copy link

leguma commented Jul 16, 2019

@malcolmsparks Wanna review #303? It should take all of 10 seconds. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants