Replies: 1 comment
-
Found a solution: I created my own optimizer and passed it to deepspeed.initialize().
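For anyone landing here later, a minimal sketch of what this could look like, assuming the preconditioning can be done inside the optimizer's own step(). The class name PreconditionedSGD and the RMS-scaling transform are hypothetical placeholders, not the code actually used in this thread; swap in your own preconditioner. It relies only on the `optimizer` argument of deepspeed.initialize(), and depending on your config (ZeRO in particular) DeepSpeed may need `zero_allow_untested_optimizer` enabled to accept a custom optimizer.

```python
import torch


class PreconditionedSGD(torch.optim.SGD):
    """Plain SGD that transforms each gradient in place right before the update.

    Hypothetical example: the "preconditioner" below just divides every
    gradient by its own RMS. Replace that line with the transform you need.
    """

    def __init__(self, params, lr=1e-2, eps=1e-8):
        super().__init__(params, lr=lr)
        self.eps = eps

    @torch.no_grad()
    def step(self, closure=None):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                # Placeholder preconditioner: normalize the gradient by its RMS.
                rms = p.grad.pow(2).mean().sqrt()
                p.grad.div_(rms + self.eps)
        # Hand the modified gradients to the regular SGD update.
        return super().step(closure)


# Assumed wiring (not run here; `net` would be your PipelineModule and
# `ds_config` your DeepSpeed config, both hypothetical names):
#
#   import deepspeed
#   engine, optimizer, _, _ = deepspeed.initialize(
#       model=net,
#       optimizer=PreconditionedSGD(net.parameters(), lr=1e-2),
#       config=ds_config,
#   )
#   loss = engine.train_batch(data_iter=train_iter)
```

Since the pipeline engine calls the optimizer's step() itself, doing the work there avoids touching PipeSchedule at all. Note that with mixed precision the gradients the optimizer sees may be DeepSpeed's fp32 copies rather than the module's own `.grad` tensors.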
-
Hello all!
I need to access (and modify) the gradients of a pipeline-parallel (PP) model.
My goal is to precondition the gradients before the optimizer step runs.
I thought of using PipeSchedule, but there is no tutorial on how to use it.
Is there any way to do this?
Cheers and thanks for any help!