
Enable memory mapping of 3d tiff files where possible #116

Open
wants to merge 2 commits into main

Conversation

@matham (Contributor) commented Feb 13, 2025

Description

What is this PR

  • Bug fix
  • Addition of a new feature
  • Other

Why is this PR needed?

Previously, if the input data was a single 3D tiff stack file, cellfinder would load the full data into memory, which is obviously a problem if the file is large and you run out of RAM. The best way around that was to use a folder of 2D tiffs, because dask then loads the data lazily.

What does this PR do?

This PR uses tifffile.memmap, which relies on np.memmap to memory map the file so that very little RAM is used, exactly like dask does for a folder of tiffs. It tries memmap first, and if that fails, e.g. because the file is compressed, it falls back to loading the data into RAM as before.
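For illustration, the fallback described above might look roughly like this (a minimal sketch using only tifffile; not the exact code in the PR):

```python
import tifffile


def load_3d_tiff(path):
    """Memory map a 3D tiff if possible, otherwise load it fully into RAM.

    Rough sketch of the fallback behaviour described above; not the PR's
    actual implementation.
    """
    try:
        # Memory mapping only works for uncompressed, contiguously stored data.
        return tifffile.memmap(path, mode="r")
    except ValueError:
        # e.g. compressed or tiled tiffs cannot be memory mapped.
        return tifffile.imread(path)
```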

References

I also added a similar PR to Napari for the same reason: napari/napari#7602.

How has this PR been tested?

I added pytest tests, but also tested it manually on a whole brain.

Here's the timing info using a folder of tiffs, as one baseline:

brainmapper -s "L:\imaging\staging\20250123\MF1_132F_W_tiffs\MF1_132F_W_BS_640" -b "L:\imaging\staging\20250123\MF1_132F_W_tiffs\MF1_132F_W_BS_488" -o "L:\imaging\staging\20250123\MF1_132F_W_tiffs\output" --no-register --no-classification --no-figures --max-cluster-size 100000 --soma-diameter 8 --ball-xy-size 6 --ball-z-size 8 --ball-overlap-fraction 0.5 --log-sigma-size 0.5 --threshold 5 --soma-spread-factor 1.4 -v 4 2.03 2.03 --orientation ial --n-free-cpus 12
2025-02-12 18:12:46 PM INFO     2025-02-12 18:12:46 PM - INFO - MainProcess fancylog.py:314 - Starting logging                                                                                                                                       fancylog.py:314
                       INFO     2025-02-12 18:12:46 PM - INFO - MainProcess fancylog.py:315 - Multiprocessing-logging module found. Logging from all processes                                                                                       fancylog.py:315
                       INFO     2025-02-12 18:12:46 PM - INFO - MainProcess main.py:70 - Skipping registration                                                                                                                                            main.py:70
2025-02-12 18:12:58 PM INFO     2025-02-12 18:12:58 PM - INFO - MainProcess main.py:116 - Detecting cell candidates                                                                                                                                      main.py:116
Processing planes: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2070/2070 [45:41<00:00,  1.32s/it]
2025-02-12 18:58:41 PM INFO     2025-02-12 18:58:41 PM - INFO - MainProcess volume_filter.py:451 - Splitting cell clusters and writing results                                                                                                  volume_filter.py:451
Splitting cell clusters: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 39643/39643 [01:48<00:00, 366.07it/s]
Detection complete. Found 3516361 cells in 0:48:17.364115
2025-02-12 19:04:21 PM INFO     2025-02-12 19:04:21 PM - INFO - MainProcess main.py:195 - Skipping cell classification                                                                                                                                   main.py:195
                       INFO     2025-02-12 19:04:21 PM - INFO - MainProcess main.py:212 - Skipping cell position analysis                                                                                                                                main.py:212
                       INFO     2025-02-12 19:04:21 PM - INFO - MainProcess main.py:241 - Skipping figure generation                                                                                                                                     main.py:241
                       INFO     2025-02-12 19:04:21 PM - INFO - MainProcess main.py:95 - Finished. Total time taken: 0:51:35.354918

Here it is on main, using a single 3D tiff file, which loads the whole thing into memory:

brainmapper -s "J:\imaging\fused\20250123\MF1_132F_W\MF1_132F_W_BS_640.tif" -b "J:\imaging\fused\20250123\MF1_132F_W\MF1_132F_W_BS_488.tif" -o "L:\imaging\staging\20250123\MF1_132F_W_tiffs\output" --no-register --no-classification --no-figures --max-cluster-size 100000 --soma-diameter 8 --ball-xy-size 6 --ball-z-size 8 --ball-overlap-fraction 0.5 --log-sigma-size 0.5 --threshold 5 --soma-spread-factor 1.4 -v 4 2.03 2.03 --orientation ial --n-free-cpus 12
2025-02-12 20:20:46 PM INFO     2025-02-12 20:20:46 PM - INFO - MainProcess fancylog.py:314 - Starting logging                                                                                                                                       fancylog.py:314
                       INFO     2025-02-12 20:20:46 PM - INFO - MainProcess fancylog.py:315 - Multiprocessing-logging module found. Logging from all processes                                                                                       fancylog.py:315
2025-02-12 20:20:47 PM INFO     2025-02-12 20:20:47 PM - INFO - MainProcess main.py:70 - Skipping registration                                                                                                                                            main.py:70
2025-02-12 20:20:54 PM INFO     2025-02-12 20:20:54 PM - INFO - MainProcess main.py:116 - Detecting cell candidates                                                                                                                                      main.py:116
Processing planes: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2070/2070 [1:01:31<00:00,  1.78s/it]
2025-02-12 21:29:09 PM INFO     2025-02-12 21:29:09 PM - INFO - MainProcess volume_filter.py:451 - Splitting cell clusters and writing results                                                                                                  volume_filter.py:451
Splitting cell clusters: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 39643/39643 [01:49<00:00, 363.68it/s]
Detection complete. Found 3516361 cells in 1:04:22.595883
2025-02-12 21:35:30 PM INFO     2025-02-12 21:35:30 PM - INFO - MainProcess main.py:195 - Skipping cell classification                                                                                                                                   main.py:195
                       INFO     2025-02-12 21:35:30 PM - INFO - MainProcess main.py:212 - Skipping cell position analysis                                                                                                                                main.py:212
                       INFO     2025-02-12 21:35:30 PM - INFO - MainProcess main.py:241 - Skipping figure generation                                                                                                                                     main.py:241
2025-02-12 21:36:04 PM INFO     2025-02-12 21:36:04 PM - INFO - MainProcess main.py:95 - Finished. Total time taken: 1:15:18.007133

Here it is with the changes from this PR, using a single 3D tiff file:

brainmapper -s "J:\imaging\fused\20250123\MF1_132F_W\MF1_132F_W_BS_640.tif" -b "J:\imaging\fused\20250123\MF1_132F_W\MF1_132F_W_BS_488.tif" -o "L:\imaging\staging\20250123\MF1_132F_W_tiffs\output" --no-register --no-classification --no-figures --max-cluster-size 100000 --soma-diameter 8 --ball-xy-size 6 --ball-z-size 8 --ball-overlap-fraction 0.5 --log-sigma-size 0.5 --threshold 5 --soma-spread-factor 1.4 -v 4 2.03 2.03 --orientation ial --n-free-cpus 12
2025-02-12 19:23:30 PM INFO     2025-02-12 19:23:30 PM - INFO - MainProcess fancylog.py:314 - Starting logging                                                                                                                                       fancylog.py:314
                       INFO     2025-02-12 19:23:30 PM - INFO - MainProcess fancylog.py:315 - Multiprocessing-logging module found. Logging from all processes                                                                                       fancylog.py:315
                       INFO     2025-02-12 19:23:30 PM - INFO - MainProcess main.py:70 - Skipping registration                                                                                                                                            main.py:70
2025-02-12 19:23:37 PM INFO     2025-02-12 19:23:37 PM - INFO - MainProcess main.py:116 - Detecting cell candidates                                                                                                                                      main.py:116
Processing planes: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2070/2070 [46:58<00:00,  1.36s/it]
2025-02-12 20:10:38 PM INFO     2025-02-12 20:10:38 PM - INFO - MainProcess volume_filter.py:451 - Splitting cell clusters and writing results                                                                                                  volume_filter.py:451
Splitting cell clusters: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 39643/39643 [01:47<00:00, 368.13it/s]
Detection complete. Found 3516361 cells in 0:49:47.688208
2025-02-12 20:16:39 PM INFO     2025-02-12 20:16:39 PM - INFO - MainProcess main.py:195 - Skipping cell classification                                                                                                                                   main.py:195
                       INFO     2025-02-12 20:16:39 PM - INFO - MainProcess main.py:212 - Skipping cell position analysis                                                                                                                                main.py:212
                       INFO     2025-02-12 20:16:39 PM - INFO - MainProcess main.py:241 - Skipping figure generation                                                                                                                                     main.py:241
2025-02-12 20:16:57 PM INFO     2025-02-12 20:16:57 PM - INFO - MainProcess main.py:95 - Finished. Total time taken: 0:53:27.393816

This shows there's no performance penalty from the changes. If anything, loading the whole brain into memory beforehand adds some slowdown.

Looking at the RAM usage during the run: with main there was the expected RAM usage bump for our process, whereas with my changes our process used minimal RAM, just like with a folder of tiffs. However, I did notice that system-level RAM usage grew progressively, until by the end it matched the size of the file. I read this to mean that the OS was caching the memory-mapped data as the planes were read during processing, so the memory shows up at the system level rather than in our process. I assume that if RAM runs low the OS will simply drop those cached pages, so that memory isn't really in use.

Is this a breaking change?

I don't think so.

Does this PR require an update to the documentation?

No.

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality (unit & integration)
  • The documentation has been updated to reflect any changes
  • The code has been formatted with pre-commit


codecov bot commented Feb 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.78%. Comparing base (5d79c85) to head (7e898d8).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #116      +/-   ##
==========================================
+ Coverage   92.73%   92.78%   +0.05%     
==========================================
  Files          37       37              
  Lines        1857     1871      +14     
==========================================
+ Hits         1722     1736      +14     
  Misses        135      135              
Flag Coverage Δ
numba 92.73% <100.00%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown.

@matham (Contributor, Author) commented Feb 13, 2025

It was pointed out in napari/napari#7602 that you can use zarr/dask via tifffile to lazily load tiff files. memmap only works for simple tiffs, e.g. where the data is not compressed, so I added zarr as a fallback. However, zarr has to be installed, and it also doesn't work with all tiffs, depending on the version, so if both memmap and zarr fail we fall back to fully loading the file.

The main issue is that previously load_img_stack, and by extension load_any, would not return dask arrays as far as I can tell, but now they can. I'm not sure if this will be an issue for downstream usage. read_z_stack could already return dask arrays, so there's no change there. Although I'm not sure why there are both load_img_stack and read_z_stack.

Also, the loading order is: try memmap -> zarr -> ndarray. I can make memmap fail by compressing the tiff, but I can't make zarr fail reliably since that depends on the zarr version, so I have no way to add coverage tests for the ndarray fallback.
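For context, the zarr step in the middle of that chain could look roughly like the sketch below (assuming tifffile's aszarr interface and dask.array; try_load_as_zarr is a placeholder name, not the PR's actual function):

```python
import dask.array as da
import tifffile


def try_load_as_zarr(path):
    """Lazily wrap a tiff's zarr store in a dask array, or return None.

    Rough sketch of the middle step in the memmap -> zarr -> ndarray chain.
    """
    try:
        import zarr  # optional dependency, may not be installed
    except ModuleNotFoundError:
        return None
    try:
        # tifffile exposes the tiff pages as a zarr store without reading them.
        store = tifffile.imread(path, aszarr=True)
        return da.from_zarr(zarr.open(store, mode="r"))
    except Exception:
        # Some tiff/zarr version combinations still fail; the caller then
        # falls back to loading the whole file into memory.
        return None
```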

@IgorTatarnikov (Member) commented:

This works great for me, I tried changing some of our test data (~100GB per channel) into 3D tiffs and there was a significant speed up when using the memmap strategy.

As for the loading order, to avoid running into issues, perhaps we should skip the zarr -> dask strategy for now, as it might have unintended consequences downstream where users are not expecting a dask array. I'll let @alessandrofelder weigh in as well!

@alessandrofelder alessandrofelder self-requested a review February 17, 2025 11:00
@alessandrofelder (Member) commented Feb 19, 2025

Although I'm not sure why there are both load_img_stack and read_z_stack.

My understanding/guess of the original intention is that

  • all the load_any functions are supposed/expected to return a numpy array
  • read_z_stack is specific and separate for the "folder with 2D tiffs -> dask array" case (while also allowing a 3D tiff that fits in memory to be passed). This is only used by brainmapper to pass to cellfinder.main, and not by any of the other (mature) BrainGlobe tools.

I can make memmap fail by compressing the tiff, but I can't make zarr fail reliably since that depends on the zarr version, so I have no way to add coverage tests for the ndarray fallback.

Could we mock da.from_zarr and give it a ModuleNotFoundError side effect?
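For example, something along these lines might work (a hedged sketch: mypackage.load is a stand-in for wherever read_z_stack actually lives, and it assumes the loader calls dask.array.from_zarr via import dask.array as da):

```python
from unittest import mock

import numpy as np
import tifffile

# Placeholder import path; point this at the real module containing read_z_stack.
from mypackage.load import read_z_stack


def test_falls_back_to_ndarray(tmp_path):
    # Compressed data, so tifffile.memmap fails and the zarr branch is tried.
    data = np.arange(4 * 8 * 8, dtype=np.uint16).reshape(4, 8, 8)
    path = tmp_path / "stack.tif"
    tifffile.imwrite(path, data, compression="zlib")

    # Force the zarr branch to fail as if zarr/dask were unavailable.
    with mock.patch("dask.array.from_zarr", side_effect=ModuleNotFoundError):
        result = read_z_stack(str(path))

    # The loader should have fallen back to a plain in-memory numpy array.
    assert isinstance(result, np.ndarray)
    np.testing.assert_array_equal(result, data)
```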


Generally, I think this is great - it also fits with our medium-term goal of moving to zarr overall (so I don't mind depending on zarr).

My understanding is that the loading functions are meant for BrainGlobe libraries only (i.e. they are not meant to be a general-purpose image reader like bioio), so we would probably be open to refactoring the downstream use cases if we decide the performance improvement is worth the developer time? To discuss at tomorrow's developer meeting.

I guess a “light” version of these changes could just apply to read_z_stack, limiting the improvement to cellfinder where it's most needed?

@matham (Contributor, Author) commented Feb 20, 2025

I can keep the changes just to read_z_stack, if we want to keep the other loaders returning numpy until zarr is fully supported.

@alessandrofelder (Member) commented:

I can keep the changes just to read_z_stack, if we want to keep the other loaders returning numpy until zarr is fully supported.

Yea, I think that's the best way forward for this. Thanks @matham
