Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reinstall signal handler when Python signal handler called #228

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

user202729
Copy link
Contributor

@user202729 user202729 commented Feb 27, 2025

This would fix sagemath/sage#39601 . Not sure if it has any downside.

@user202729 user202729 marked this pull request as draft February 27, 2025 18:15
@user202729 user202729 force-pushed the reinstall-signal-handler branch 2 times, most recently from 24aa6c2 to 231c981 Compare February 28, 2025 03:55
@user202729
Copy link
Contributor Author

user202729 commented Feb 28, 2025

These two tests are failing: test_interrupt_bomb and test_sig_block_outside_sig_on.

The test can be fixed by adding appropriate PyErr_CheckSignals , but that feels like fixing the symptoms.

On the other hand a test could be added. It could be as simple as

import signal
from signal import SIGINT
signal.signal(SIGINT, signal.getsignal(SIGINT))

@user202729 user202729 force-pushed the reinstall-signal-handler branch from 231c981 to df2b512 Compare February 28, 2025 03:59
@user202729 user202729 marked this pull request as ready for review February 28, 2025 03:59
@tornaria
Copy link
Contributor

The point of having a os-level interrupt handler is to rollback the process (using longjmp) to a state where the python signal handler can be called.

IOW, I don't think this will fix anything until the python handler is called; the problem is when the python handler is NOT called (this is the reason why cysignals needs to use a os-level interrupt handler).

I think there's something special about time.sleep() that allows it to be interrupted with a python signal handler (i.e. cysignals is not needed here). Your benchmark should be to be able to interrupt code like this:

$ cat forever.pyx 
from cysignals.signals cimport sig_on, sig_off

def run():
    sig_on()
    while True:
        pass
    sig_off()

I believe the only way to fix this is for matplotlib to restore the os-level interrupt handler. See prompt-toolkit/python-prompt-toolkit#1576 for a similar issue and prompt-toolkit/python-prompt-toolkit#1822 for the fix.

A different workaround might be to make sure the signal handler is correctly installed at each prompt (I'm not sure ipython has a hook for that... I think prompt_toolkit has it, but we don't use prompt_toolkit directly afair). This would not be good for scripts but at least an improvement for interactive use.

@user202729
Copy link
Contributor Author

user202729 commented Mar 1, 2025

  1. Fixing matplotlib would be a good idea.

  2. It's true that fixes such as this one won't fix code of the following form

    plt.pause(2)
    [wait for 1 second]
    [Ctrl-C]
    longRunningCythonFunction()
    [Ctrl-C]
    

    but at least it will fix the following case (which is much more common)

    plt.pause(2)
    [wait for 1 second]
    [Ctrl-C]
    longRunningPurePythonFunction()
    [Ctrl-C]
    
  3. In an ideal world the library that changes/restores signal handler should be responsible for it. In practice programs are preferably bug-proof i.e. if other programs are buggy we should try our best to gracefully handle error.

  4. Looks like this pull request actually causes some trouble (when automatic pdb call on error is enabled, it ends up invoking pdb twice on keyboard interrupt…? or maybe that's an irrelevant issue)

    though any idea why the behavior change happens? Could the python signal handler be ran twice? (how exactly does PyErr_SetInterrupt and PyErr_CheckSignals work?)

@user202729 user202729 marked this pull request as draft March 1, 2025 01:27
@tornaria
Copy link
Contributor

tornaria commented Mar 1, 2025

  1. Fixing matplotlib would be a good idea.

It's not even that hard, see my fix for prompt_toolkit. It went through a lot of iterations to make it so simple: pure python (thanks to ctypes) and using only the stable abi. This was fundamental to convince upstream to merge it.

2. It's true that fixes such as this one won't fix code of the following form
   ```
   plt.pause(2)
   [wait for 1 second]
   [Ctrl-C]
   longRunningCythonFunction()
   [Ctrl-C]
   ```

This is the whole (only?) point of using cysignals in the first place.

   but at least it will fix the following case (which is much more common)
   ```
   plt.pause(2)
   [wait for 1 second]
   [Ctrl-C]
   longRunningPurePythonFunction()
   [Ctrl-C]
   ```

We don't need cysignals for this. Whether this is more or less common is debatable, but the whole point of sagemath is to integrate external libraries which do non-trivial computations.

3. In an ideal world the library that changes/restores signal handler should be responsible for it. In practice programs are preferably bug-proof i.e. if other programs are buggy we should try our best to gracefully handle error.

This is free software with an open issue tracker: we can go fix matplotlib. Maybe we can even come up with something useful to add to python signal module (and documentation!) that will make it easier to preserve signals.

Note that the original bug in prompt_toolkit was they did not restore the sigint handler at all (not even the python one) and so your workaround would not have done anything in that situation either.

Could you try to save the os-level sigint handler at https://github.com/matplotlib/matplotlib/blob/main/lib/matplotlib/backend_bases.py#L1635 (you can adapt my changes in prompt-toolkit/python-prompt-toolkit#1822)

4. Looks like this pull request actually causes some trouble (when automatic pdb call on error is enabled, it ends up invoking pdb twice on keyboard interrupt…? or maybe that's an irrelevant issue)
   though any idea why the behavior change happens? Could the python signal handler be ran twice? (how exactly does `PyErr_SetInterrupt` and `PyErr_CheckSignals` work?)

Reentrancy is hard. I'm not sure calling init_cysignals() from within the handler is safe. The fact that you had to modify tests seems to indicate it might not be.

@user202729
Copy link
Contributor Author

We don't need cysignals for this.

Yes, but that's not the point: my point is [matplotlib + no cysignals] leaves interruption of Python code working, but [matplotlib + cysignals] breaks interruption of Python code in this specific case.

This is free software with an open issue tracker: we can go fix matplotlib.

I mean, yes we're lucky in this case, but the next time some user uses some other library that also backups/restores the signal handler the same way we should probably at least print out a warning message to tell the user what to do instead of making them going through maybe an hour of debugging.

If there is a reliable way of detecting this condition, that is. Looks like the function is already called without any signal.setsignal() going on (thus existing tests break), and I have no idea why.

@tornaria
Copy link
Contributor

tornaria commented Mar 1, 2025

We don't need cysignals for this.

Yes, but that's not the point: my point is [matplotlib + no cysignals] leaves interruption of Python code working, but [matplotlib + cysignals] breaks interruption of Python code in this specific case.

I understand what you mean. But by the same reasoning, [cysignals + no matplotlib] leaves interruption of python and non-python code working, but [cysignals + matplotlib] breaks interruption of python code and non-python code.

While "fixing" this on the cysignals side might be worthwhile, it won't fix the interruption of non-python code. OTOH, fixing this on the matplotlib side should be very easy and fix everything.

I mean, yes we're lucky in this case, but the next time some user uses some other library that also backups/restores the signal handler the same way we should probably at least print out a warning message to tell the user what to do instead of making them going through maybe an hour of debugging.

Another library may change the signal handler and not restore it. For instance, prompt_toolkit used to do that. In this case, matplotlib does it, but not completely. Easy to fix.

If there is a reliable way of detecting this condition, that is. Looks like the function is already called without any signal.setsignal() going on (thus existing tests break), and I have no idea why.

I think this goes back to the idea of doing something at each prompt: we could check the signal handler to see if it has been modified, and in that case warn the user (and fix it). This may be better than doing it in the interrupt handler itself, because (a) it avoids reentrancy (b) it will work in more cases (c) it will warn the user.

Another option, perhaps, is to replace the signal module and perhaps intercept signal.signal() because that seems to be the root cause of the problem, since python overrides the os-level signal handler back to its own, leaving cysignals outside

@user202729
Copy link
Contributor Author

I think this goes back to the idea of doing something at each prompt: we could check the signal handler to see if it has been modified, and in that case warn the user (and fix it).

This does sound like a reasonable idea: it will catches similar error cases.

Another option, perhaps, is to replace the signal module and perhaps intercept signal.signal() because that seems to be the root cause of the problem, since python overrides the os-level signal handler back to its own, leaving cysignals outside

Possible but might have unintended side effect? (might happen if someone try to inspect.getsource the function, but I think this is very unlikely)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Interruption breaks after matplotlib plt.pause()
2 participants