Replies: 1 comment
-
@toriving, thanks for your question. Can you please create an issue so we can investigate further?
-
Hello.
I am training a model using DeepSpeed.
I tried using Adafactor (a non-native optimizer) as the optimizer.
However, the loss did not decrease, and this only happens when using DeepSpeed.
Does DeepSpeed not currently support adaptive-style optimizers?
Or does it not support Adafactor specifically?
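For reference, since the exact training script is not shown in the question, here is a minimal sketch of how a non-native (client) optimizer such as Adafactor is commonly handed to DeepSpeed through `deepspeed.initialize`. The model, config path, and hyperparameters below are illustrative assumptions, not the setup from the question:

```python
# Minimal sketch: passing Adafactor to DeepSpeed as a client ("non-native")
# optimizer. The toy model, ds_config.json path, and learning rate are
# placeholders, not taken from the question.
import torch
import deepspeed
from transformers.optimization import Adafactor

model = torch.nn.Linear(512, 512)  # placeholder for the real model

# With relative_step/scale_parameter disabled, Adafactor uses the explicit
# learning rate instead of its internal relative-step schedule.
optimizer = Adafactor(
    model.parameters(),
    lr=1e-3,
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
)

# The DeepSpeed JSON config is assumed to have no "optimizer" section,
# so DeepSpeed wraps the client optimizer passed here instead of building
# one of its native optimizers.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    optimizer=optimizer,
    config="ds_config.json",  # illustrative path
)
```

One thing that may be worth checking in a setup like this: with Adafactor's default `relative_step=True`, the explicit learning rate is ignored in favor of an internal schedule, which can interact badly with an externally configured scheduler and leave the loss flat.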