-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix parallel access crashes and misbehavior #136
Conversation
flexdll.c
Outdated
goto again; | ||
} | ||
} else { | ||
if (WaitForSingleObject(units_mutex, INFINITE) == WAIT_FAILED) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (WaitForSingleObject(units_mutex, INFINITE) == WAIT_FAILED) { | |
if (WaitForSingleObject(units_mutex, INFINITE) != WAIT_OBJECT_0) { |
This would cover the improbable (impossible?) case of the mutex having been abandoned.
https://learn.microsoft.com/en-us/windows/win32/sync/using-mutex-objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up reverting this one.
WAIT_ABANDONED
does not offer GetLastError
-integration, so I started adding an explicit error message case for it - only to then hit it during testing.
I therefore suppose these two cases were written to handle both WAIT_OBJECT_0
and WAIT_ABANDONED
as success-cases (WAIT_TIMEOUT
should not happen with INFINITE
...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning I think is that WAIT_ABANDONED should be treated like success?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(i.e. I agree with @jmid's assessment!)
err = get_tls_error(TLS_ERROR_NOP); | ||
if(err == NULL) return NULL; | ||
|
||
if (WaitForSingleObject(units_mutex, INFINITE) == WAIT_FAILED) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (WaitForSingleObject(units_mutex, INFINITE) == WAIT_FAILED) { | |
if (WaitForSingleObject(units_mutex, INFINITE) != WAIT_OBJECT_0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(same for this one)
217db5d
to
67efa7d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested an finally understood why the code is correct, thanks Jan! I think this is good to go.
Minor suggestion still.
Co-authored-by: Antonin Décimo <[email protected]>
47480bc
to
4950924
Compare
I've addressed the last minor comment, added a CHANGES entry, and rebased on |
Thank you both for your work on this, and sorry for taking quite so long to merge it! |
Parallel usage is memory unsafe (read: may crash) as documented in #120, ocaml/ocaml#11607, and ocaml/ocaml#13046.
This PR goes for the simplest possible fix: adding a single global lock by dusting off the first commit of https://github.com/dra27/flexdll/tree/sledgehammer and suitable rebasing, renaming, and error handling.
Author credit thus goes to @dra27 - any errors are mine.
For the error handling, I've tried to make it fit with @shym's TLS-based error handling from #112.
I'm unsure how to test these error code paths though without explicitly mocking with the source code to create an invalid lock handle.
With the fix
tests/lib-dynlink-domains
test from Temporarily disable the lib-dynlink-domains test on Windows ocaml#11607Dynlink
stress test frommulticoretests
passes(these have been tested under MinGW in a Cygwin-shell)