Improving nnUNet inference speed #2048
Conversation
* Changing `lru_cache`'s `maxsize=2` to `maxsize=None` for faster access, since the cache will not be filled with more than 2 values
* Added `n_predictions` back to replicate the previous behavior
* Setting the gaussian to 1 if not using the gaussian
* Setting the `lru_cache` size back to 2 to prevent OOM for unintended usage (see the sketch after this list)
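A minimal sketch of the caching pattern these bullets describe, assuming a hypothetical `compute_gaussian` helper (the real nnUNet function takes additional arguments such as device and dtype):

```python
from functools import lru_cache

import numpy as np
import torch
from scipy.ndimage import gaussian_filter

@lru_cache(maxsize=2)  # bounded cache: prevents OOM if callers pass many distinct tile sizes
def compute_gaussian(tile_size: tuple, sigma_scale: float = 1.0 / 8) -> torch.Tensor:
    # Gaussian importance map used to weight overlapping sliding-window predictions.
    tmp = np.zeros(tile_size)
    tmp[tuple(s // 2 for s in tile_size)] = 1  # unit impulse at the tile center
    gaussian = gaussian_filter(tmp, sigma=[s * sigma_scale for s in tile_size])
    gaussian /= gaussian.max()  # normalize so the center weight is 1
    return torch.from_numpy(gaussian)
```

Since inference typically queries only one or two tile sizes, every call after the first is a cache hit either way; the bounded `maxsize=2` just guards against unintended usage with many shapes.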
Thanks! Had to make two small changes to make sure everything works + yields the same results.
Tensors created during inference mode are slimmer because they do not have a version counter, and requires_grad can't be set to True anymore. More information here: https://stackoverflow.com/a/74197846/18441695.
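A quick illustration of that difference in plain PyTorch (nothing here is specific to nnUNet):

```python
import torch

with torch.no_grad():
    a = torch.ones(3)
a.requires_grad_(True)       # fine: `a` is an ordinary tensor, just created without grad tracking

with torch.inference_mode():
    b = torch.ones(3)        # inference tensor: no version counter, slightly lighter-weight
try:
    b.requires_grad_(True)   # inference tensors refuse autograd metadata
except RuntimeError as err:
    print(err)               # "Setting requires_grad=True on inference tensor ... is not allowed"
```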
Main additions are:
* Setting `torch.backends.cudnn.benchmark = True` (used during training, but forgotten during inference).
* Replacing the `torch.no_grad` contexts with a single `torch.inference_mode` context (sketched below).
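Put together, the inference entry point would look roughly like this; a sketch only, where `network` and `tiles` stand in for nnUNet's model and sliding-window patches:

```python
import torch

# cuDNN autotuning pays off here because sliding-window tiles all share one shape.
torch.backends.cudnn.benchmark = True

def predict(network: torch.nn.Module, tiles: list[torch.Tensor]) -> list[torch.Tensor]:
    network.eval()
    # One inference_mode context around the whole loop, instead of
    # re-entering a no_grad context for every tile.
    with torch.inference_mode():
        return [network(tile) for tile in tiles]
```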