Warmup and optimization seems to slow inferencing #18

Open
berombau opened this issue Oct 20, 2022 · 0 comments

When I use Engine3D to run inference, PyTorch JIT optimization after the first batch slows down the whole inference run. Is this expected behavior? Inference completes faster overall when optimization is disabled, which is controlled by the torch.jit.optimized_execution(should_optimize) context manager, mentioned here. I am running the MitoNet model on a CentOS workstation with an A30 GPU.

def stack_inference(engine, volume, axis_name):
    # Run inference with the JIT profiling/optimization passes enabled.
    with torch.jit.optimized_execution(True):
        stack, trackers = engine.infer_on_axis(volume, axis_name)
    return stack, trackers
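The same toggle can be exercised on a toy scripted module (a minimal sketch using only stock PyTorch; the Linear model below is a hypothetical stand-in for the empanada engine, not its actual API):

```python
import torch

# Hypothetical stand-in for the engine's internal TorchScript model.
model = torch.jit.script(torch.nn.Linear(4, 2))
x = torch.randn(1, 4)

# Profiling/optimization enabled (the default): early calls trigger
# profiling runs and graph optimization, which can dominate short workloads.
with torch.jit.optimized_execution(True):
    y_opt = model(x)

# Optimization disabled: the scripted graph is executed as-is, skipping
# the warmup/recompilation passes.
with torch.jit.optimized_execution(False):
    y_plain = model(x)

# The numerical result is the same either way; only runtime behavior differs.
print(torch.allclose(y_opt, y_plain))
```

Either form of the context manager (positional `True`/`False` or `should_optimize=...`) works; it only changes how the TorchScript graph is executed, not what it computes.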

With optimization:

(empanada) [benjaminr@cp0293 empanada]$ time python inference_3D.py 
model config: {'class_names': {1: 'mito'}, 'labels': [1], 'thing_list': [1], 'model': 'https://zenodo.org/record/6861565/files/MitoNet_v1.pth?download=1', 'model_quantized': 'https://zenodo.org/record/6861565/files/MitoNet_v1_quantized.pth?download=1', 'padding_factor': 16, 'norms': {'mean': 0.57571, 'std': 0.12765}, 'description': 'MitoNet_v1 was trained on the large CEM-MitoLab dataset and is a generalist for mitochondrial segmentation. The underlying architecture is PanopticDeeplab. This model is fairly large but powerful. If GPU memory is a limitation, try using MitoNet_v1_mini instead. Read the preprint: https://www.biorxiv.org/content/10.1101/2022.03.17.484806\n', 'FINETUNE': {'criterion': 'PanopticLoss', 'criterion_params': {'ce_weight': 1, 'l1_weight': 0.01, 'mse_weight': 200, 'pr_weight': 1, 'top_k_percent': 0.2}, 'dataset_class': 'SingleClassInstanceDataset', 'dataset_params': {'weight_gamma': 0.7}, 'engine': 'PanopticDeepLabEngine', 'engine_params': {'confidence_thr': 0.5, 'label_divisor': 1000, 'nms_kernel': 7, 'nms_threshold': 0.1, 'stuff_area': 64, 'thing_list': [1], 'void_label': 0}}}
Predicting xy...
  0%|                                                                                                                                                                     | 0/512 [00:00<?, ?it/s]/srv/scratch/benjaminr/anaconda3/envs/empanada/lib/python3.9/site-packages/torch/nn/modules/module.py:1130: UserWarning: operator() sees varying value in profiling, ignoring and this should be handled by GUARD logic (Triggered internally at  /opt/conda/conda-bld/pytorch_1659484806139/work/torch/csrc/jit/codegen/cuda/parser.cpp:3513.)
  return forward_call(*input, **kwargs)
  0%|▎                                                                                                                                                            | 1/512 [00:05<47:01,  5.52s/it]/srv/scratch/benjaminr/anaconda3/envs/empanada/lib/python3.9/site-packages/torch/nn/modules/module.py:1130: UserWarning: operator() profile_node %3387 : int[] = prim::profile_ivalue(%3385)
 does not have profile information (Triggered internally at  /opt/conda/conda-bld/pytorch_1659484806139/work/torch/csrc/jit/codegen/cuda/graph_fuser.cpp:104.)
  return forward_call(*input, **kwargs)
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 512/512 [01:00<00:00,  8.53it/s]
Propagating labels backward through the stack...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 512/512 [00:02<00:00, 181.20it/s]
Post-processing {'xy': [<empanada.inference.tracker.InstanceTracker object at 0x7fb252851880>]}
Creating stack segmentation for class mito...
Total mito objects 190

real    1m19.001s
user    1m24.997s
sys     0m12.129s

Without optimization:

(empanada) [benjaminr@cp0293 empanada]$ time python inference_3D.py 
model config: {'class_names': {1: 'mito'}, 'labels': [1], 'thing_list': [1], 'model': 'https://zenodo.org/record/6861565/files/MitoNet_v1.pth?download=1', 'model_quantized': 'https://zenodo.org/record/6861565/files/MitoNet_v1_quantized.pth?download=1', 'padding_factor': 16, 'norms': {'mean': 0.57571, 'std': 0.12765}, 'description': 'MitoNet_v1 was trained on the large CEM-MitoLab dataset and is a generalist for mitochondrial segmentation. The underlying architecture is PanopticDeeplab. This model is fairly large but powerful. If GPU memory is a limitation, try using MitoNet_v1_mini instead. Read the preprint: https://www.biorxiv.org/content/10.1101/2022.03.17.484806\n', 'FINETUNE': {'criterion': 'PanopticLoss', 'criterion_params': {'ce_weight': 1, 'l1_weight': 0.01, 'mse_weight': 200, 'pr_weight': 1, 'top_k_percent': 0.2}, 'dataset_class': 'SingleClassInstanceDataset', 'dataset_params': {'weight_gamma': 0.7}, 'engine': 'PanopticDeepLabEngine', 'engine_params': {'confidence_thr': 0.5, 'label_divisor': 1000, 'nms_kernel': 7, 'nms_threshold': 0.1, 'stuff_area': 64, 'thing_list': [1], 'void_label': 0}}}
Predicting xy...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 512/512 [00:23<00:00, 21.46it/s]
Propagating labels backward through the stack...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 512/512 [00:02<00:00, 220.48it/s]
Post-processing {'xy': [<empanada.inference.tracker.InstanceTracker object at 0x7fd31cf2c610>]}
Creating stack segmentation for class mito...
Total mito objects 190

real    0m42.225s
user    0m49.972s
sys     0m11.978s
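One way to confirm that the slowdown comes from JIT warmup rather than the model itself is to time individual iterations (a minimal sketch with a toy scripted module, not the empanada engine; with optimization enabled, the first iterations typically include profiling and recompilation overhead):

```python
import time
import torch

# Toy scripted module; a hypothetical stand-in for the real engine.
model = torch.jit.script(torch.nn.Linear(64, 64))
x = torch.randn(8, 64)

with torch.jit.optimized_execution(True):
    times = []
    for _ in range(10):
        t0 = time.perf_counter()
        model(x)
        times.append(time.perf_counter() - t0)

# The first call(s) are usually much slower than the steady state,
# which matches the stalled progress bar seen at 0-1/512 above.
print(f"first: {times[0]*1e6:.0f}us, last: {times[-1]*1e6:.0f}us")
```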