Hello everyone, has anyone noticed a drop in performance (higher test and validation loss) when training with dtype='float32'? I'm training on the Shakespeare dataset with the train_shakespeare_char config file.
I have not changed anything from the original repo except dtype.
It seems related to the use of torch.amp.autocast, but I don't understand why increasing precision from bfloat16 to float32 would cause a drop in performance.
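For context, here is a minimal sketch of how the dtype setting and autocast context are typically wired together in this kind of training script (variable names here are my own and may not match the repo exactly):

```python
import torch
from contextlib import nullcontext

# Sketch with assumed names, not verbatim from the repo:
dtype = 'float32'  # the only setting changed from the default
device_type = 'cuda'
ptdtype = {'float32': torch.float32,
           'bfloat16': torch.bfloat16,
           'float16': torch.float16}[dtype]

# With float32, autocast performs no downcasting, so the forward pass
# runs entirely in full precision; with bfloat16/float16 it casts
# matmuls and other eligible ops to the lower-precision type.
ctx = nullcontext() if device_type == 'cpu' else torch.amp.autocast(
    device_type=device_type, dtype=ptdtype)

with ctx:
    # forward pass and loss computation would go here
    pass
```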
Thank you for your help!
This is just speculation, but maybe the lower precision of bfloat16 introduces some form of regularization noise into the model; given the small size of the dataset and the model, that may actually help it generalize better.