Improving nnUNet inference speed #2048
Conversation
* Changing `lru_cache`'s `maxsize=2` to `maxsize=None` for faster access, since the cache will not be filled with more than 2 values
* Added `n_predictions` back to replicate the previous behavior
* Setting the gaussian to 1 if not using the gaussian
* Setting the `lru_cache` size back to 2 to prevent OOM for unintended usage (see the sketch after this list)
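A minimal sketch of the caching pattern these bullets describe, assuming a hypothetical `compute_gaussian` helper (the real nnUNet function takes additional arguments such as device and dtype):

```python
from functools import lru_cache

import numpy as np
import torch
from scipy.ndimage import gaussian_filter

@lru_cache(maxsize=2)  # bounded cache: prevents OOM if callers pass many distinct tile sizes
def compute_gaussian(tile_size: tuple, sigma_scale: float = 1.0 / 8) -> torch.Tensor:
    # Gaussian importance map used to weight overlapping sliding-window predictions.
    tmp = np.zeros(tile_size)
    tmp[tuple(s // 2 for s in tile_size)] = 1  # unit impulse at the tile center
    gaussian = gaussian_filter(tmp, sigma=[s * sigma_scale for s in tile_size])
    gaussian /= gaussian.max()  # normalize so the center weight is 1
    return torch.from_numpy(gaussian)
```

Since inference typically queries only one or two tile sizes, every call after the first is a cache hit either way; the bounded `maxsize=2` just guards against unintended usage with many shapes.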
Thanks! Had to make two small changes to make sure everything works + yields the same results.
Tensors created during inference mode are slimmer because they do not have a version counter, and requires_grad can't be set to True anymore. More information here: https://stackoverflow.com/a/74197846/18441695.
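A quick illustration of that difference in plain PyTorch (nothing here is specific to nnUNet):

```python
import torch

with torch.no_grad():
    a = torch.ones(3)
a.requires_grad_(True)       # fine: `a` is an ordinary tensor, just created without grad tracking

with torch.inference_mode():
    b = torch.ones(3)        # inference tensor: no version counter, slightly lighter-weight
try:
    b.requires_grad_(True)   # inference tensors refuse autograd metadata
except RuntimeError as err:
    print(err)               # "Setting requires_grad=True on inference tensor ... is not allowed"
```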
Main additions are:
* Setting `torch.backends.cudnn.benchmark = True` (used during training, but forgotten during inference).
* Replacing the `torch.no_grad` contexts with a single `torch.inference_mode` context (sketched below).
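Put together, the inference entry point would look roughly like this; a sketch only, where `network` and `tiles` stand in for nnUNet's model and sliding-window patches:

```python
import torch

# cuDNN autotuning pays off here because sliding-window tiles all share one shape.
torch.backends.cudnn.benchmark = True

def predict(network: torch.nn.Module, tiles: list[torch.Tensor]) -> list[torch.Tensor]:
    network.eval()
    # One inference_mode context around the whole loop, instead of
    # re-entering a no_grad context for every tile.
    with torch.inference_mode():
        return [network(tile) for tile in tiles]
```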