While debugging a failed InterpreterTest (https://github.com/restatedev/e2e/actions/runs/8831342633/job/24246356686?pr=324), I noticed that the failure criterion for side effects uses Math.random in interpreter.ts to decide whether a side effect succeeds or fails. Because of this, I was not able to reproduce the failed test run from GHA: the underlying cause of the failure was too many retries, which exceeded the test timeout.
I am wondering whether it would be possible to make the InterpreterTest fully deterministic to help with debugging failed runs. If we could set a stable seed for the RNG which we use for failing side effects, then this could remove this source of non-determinism. Alternatively, maybe the program already encodes how often a side effect should fail before succeeding. I guess the challenge would be to track how often something has already been retried/what was the last state of the RNG.
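To make the seeding idea concrete, here is a minimal sketch, not taken from interpreter.ts. All names (`mulberry32`, `fnv1a`, `shouldFail`, and the `seed`, `key`, `attempt`, `failureProbability` parameters) are hypothetical. It assumes the handler can learn the current attempt number for a side effect, which is exactly the open challenge described above. Under that assumption, the fail/succeed decision becomes a pure function of its inputs, so no RNG state has to be stored anywhere between retries.

```typescript
// Hypothetical sketch: none of these names come from interpreter.ts.

// mulberry32: a small seeded 32-bit PRNG that returns floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// FNV-1a: hashes a string to an unsigned 32-bit integer.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Decides whether a side effect should fail on a given attempt.
// Assumption: `key` identifies the side effect and `attempt` is known to the caller.
// The same (seed, key, attempt) triple always yields the same decision.
function shouldFail(
  seed: number,
  key: string,
  attempt: number,
  failureProbability: number
): boolean {
  const draw = mulberry32(fnv1a(`${seed}:${key}:${attempt}`))();
  return draw < failureProbability;
}
```

If the test run logged its seed, a failed GHA run could in principle be replayed locally by passing the same seed back in. How the seed would reach the service (for example via an environment variable) is left open here.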
You’ve pointed out the challenges correctly, Till.
It is preferable not to keep any external state (like a file or a third party), but I’ll keep thinking about it.
Regarding the particular test failure, we need to make sure that the timeout is 10x the duration of an average run, to avoid any false positives.