Hello,

Looking at train_utils.py:581, in the eval_reconstruction section, there doesn't seem to be any sync of eval values between the GPUs. Each GPU computes rFID etc. on its part of the eval set, but only the value from the main GPU is reported. I'm not sure whether I'm reading this right, or whether you're aware of it and it's intentional, but I wanted to bring it to your attention just in case.
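For what it's worth, syncing here would likely mean gathering the raw Inception features rather than the per-rank scores, since FID is a nonlinear function of the feature statistics and shard-wise FIDs can't simply be averaged. A minimal sketch with torch.distributed, assuming the process group is already initialized; `gather_features` and `compute_fid` are illustrative names, not this repo's API:

```python
import torch
import torch.distributed as dist

def gather_features(local_feats: torch.Tensor) -> torch.Tensor:
    """Collect per-rank feature batches so rank 0 can score the full eval set.

    Note: all_gather needs equal-sized tensors on every rank, so uneven
    last shards would need padding (or dist.all_gather_object) in practice.
    """
    world_size = dist.get_world_size()
    buckets = [torch.zeros_like(local_feats) for _ in range(world_size)]
    dist.all_gather(buckets, local_feats)
    return torch.cat(buckets, dim=0)

# Each rank extracts features from its shard, then all ranks gather:
# all_feats = gather_features(local_feats)
# if dist.get_rank() == 0:
#     rfid = compute_fid(all_feats, ref_feats)  # hypothetical FID helper
```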
The current code should run evaluation on the full val set on each device (GPU), so the online evaluation still reports correct numbers, but the computation is wasted (each device tests the whole dataset). We will check on our side and update with a distributed version to speed up evaluation soon. Feel free to let me know if you have any questions :)
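Roughly, the idea would be to shard the val set across ranks and gather the features afterwards, as in the sketch above. A minimal sketch of one way to do it, assuming a standard PyTorch DataLoader setup; `build_sharded_val_loader` is an illustrative name, not final code:

```python
from torch.utils.data import DataLoader, DistributedSampler

def build_sharded_val_loader(val_dataset, batch_size: int) -> DataLoader:
    # DistributedSampler hands each rank a disjoint 1/world_size slice of the
    # val set, removing the redundant full-set pass on every GPU.
    sampler = DistributedSampler(val_dataset, shuffle=False, drop_last=False)
    return DataLoader(val_dataset, batch_size=batch_size,
                      sampler=sampler, num_workers=4, pin_memory=True)

# Caveat: with drop_last=False the sampler pads to a multiple of world_size by
# repeating samples, which can bias rFID slightly; trim or deduplicate the
# gathered features if exact numbers matter.
```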