High loss and low bleu-4 for training #192

When I train a new model on the Flickr8k and Flickr30k datasets in my environment, I find that the training loss is too high (about 10) and the BLEU-4 is too low (about 2.4e-232) after 20 epochs. It is also very strange that the parameter epochs_since_improvement is 20. I didn't change the train.py code except for some small bugs. How can I improve it? Is anyone having the same problem? Thanks!

Comments
What exactly have you changed in the code? Be wary of erasing things like "global best_bleu4, epochs_since_improvement, checkpoint, start_epoch, fine_tune_encoder, data_name, word_map". PEP linters will mark those as warnings, but here they have a good use.
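For context, here is a minimal sketch of the pattern being referred to: train.py shares training state through module-level variables that are rebound inside the entry point. The variable names come from the comment above; the surrounding code is illustrative, not the repo's exact code.

```python
# Module-level training state updated across epochs.
best_bleu4 = 0.0              # best validation BLEU-4 seen so far
epochs_since_improvement = 0  # epochs with no BLEU-4 improvement
checkpoint = None             # checkpoint path to resume from, if any

def main():
    # PEP 8 checkers warn about `global`, but without it the assignments
    # below would create new local variables, and the shared training
    # state would silently stop updating between epochs.
    global best_bleu4, epochs_since_improvement, checkpoint

    if checkpoint is None:
        best_bleu4 = 0.0
        epochs_since_improvement = 0
```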
I just changed the code from "scores, _ = pack_padded_sequence(scores, decode_lengths, batch_first=True)" to "scores = pack_padded_sequence(scores, decode_lengths, batch_first=True).data" to debug. I also changed some data parameters at the beginning of train.py, but I don't think that would influence much. I didn't change the global parameters code. Do you know how to make the loss converge? Should I lower the learning rate?
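On the learning-rate question: a common remedy when the loss plateaus is to shrink the learning rate after several epochs without improvement. A minimal sketch under that assumption; the 0.8 shrink factor, the every-8-epochs rule, and the name decoder_optimizer are illustrative, not confirmed from the repo.

```python
def adjust_learning_rate(optimizer, shrink_factor):
    # Scale the learning rate of every parameter group, e.g. by 0.8.
    for param_group in optimizer.param_groups:
        param_group['lr'] *= shrink_factor

# Illustrative usage inside the epoch loop:
if epochs_since_improvement > 0 and epochs_since_improvement % 8 == 0:
    adjust_learning_rate(decoder_optimizer, 0.8)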
Have you tried this fix instead?
Yeah, I just deleted the '_', but the cross-entropy loss must accept two tensor arguments, so I added '.data' to the end of that line.
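To make the failure mode concrete, here is a small self-contained sketch (toy shapes, not the repo's real data) of why the original unpacking breaks on newer PyTorch and why taking .data works:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Toy batch of two caption score sequences, sorted by decreasing length.
scores = torch.randn(2, 5, 100)          # (batch, max_len, vocab_size)
targets = torch.randint(0, 100, (2, 5))  # (batch, max_len)
decode_lengths = [5, 3]

# Older PyTorch returned a 2-field PackedSequence, so this unpacking worked:
#   scores, _ = pack_padded_sequence(scores, decode_lengths, batch_first=True)
# Current versions return a 4-field PackedSequence (data, batch_sizes,
# sorted_indices, unsorted_indices), so that line raises
# "ValueError: too many values to unpack". Reading .data keeps only the
# non-padded timesteps, flattened to (sum(decode_lengths), vocab_size):
scores = pack_padded_sequence(scores, decode_lengths, batch_first=True).data
targets = pack_padded_sequence(targets, decode_lengths, batch_first=True).data

# Both arguments are now plain tensors, which is what the loss expects.
loss = torch.nn.CrossEntropyLoss()(scores, targets)
```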
That's true. They should be the same in the loss, whether you use '.data' or '[0]'.
I changed the code to "scores = pack_padded_sequence(scores, decode_lengths, batch_first=True)[0]". My train.py works, but the loss just doesn't decrease.
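For what it's worth, the two variants really are interchangeable here: PackedSequence is a namedtuple whose first field is data, so indexing with [0] and reading .data return the very same tensor. A quick check:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

x = torch.randn(2, 4, 10)  # (batch, max_len, vocab)
packed = pack_padded_sequence(x, [4, 2], batch_first=True)

# PackedSequence is a namedtuple; field 0 is `data`.
assert packed[0] is packed.data
```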