r.horizon: Support parallel computing for the point mode by OpenMP #4213

Draft: wants to merge 5 commits into base: main

cyliang368
Contributor

This PR continues the OpenMP parallelization of the r.horizon module. #3890 parallelized the raster mode; this PR parallelizes the point mode.

@github-actions github-actions bot added the raster, Python, C, and module labels on Aug 22, 2024
@cyliang368
Contributor Author

The point mode involves far less computation than the raster mode. In addition, the results are printed directly, which is very expensive in terms of runtime. Printing dominates the runtime of the whole process, and it cannot be parallelized. As a result, the efficiency looks poor when the benchmark is run directly. Overall, parallelization saves some running time, but not much. @petrasovaa, please review this PR. If you think we should move on, I will update the documentation.
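To illustrate the split described above, here is a minimal sketch of the general pattern: the independent per-direction computations run in an OpenMP parallel loop and write into a buffer, and the output is then printed serially in order. This is not the code in this PR. `compute_angle_stub`, `compute_all`, and `print_all` are hypothetical names, and the arithmetic in the stub is a placeholder for the real horizon computation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-direction horizon computation.
 * The arithmetic is a placeholder, not the r.horizon algorithm. */
static double compute_angle_stub(double azimuth_deg)
{
    return 0.5 * azimuth_deg;
}

/* Compute phase: each iteration writes only to its own slot in out[],
 * so the loop can be shared among threads without locking.
 * Without OpenMP enabled, the pragma is ignored and the loop runs serially. */
void compute_all(double start_deg, double step_deg, int n, double *out)
{
    assert(out != NULL);
#pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        out[i] = compute_angle_stub(start_deg + i * step_deg);
}

/* Output phase: kept serial so that rows appear in direction order.
 * Returns the number of rows written. */
int print_all(FILE *fp, double start_deg, double step_deg, int n,
              const double *vals)
{
    assert(fp != NULL && vals != NULL);
    for (int i = 0; i < n; i++)
        fprintf(fp, "%.6f,%.6f\n", start_deg + i * step_deg, vals[i]);
    return n;
}
```

With GCC or Clang, OpenMP is enabled with `-fopenmp`. Only `compute_all` benefits from additional threads; the time spent in `print_all` stays the same regardless of the thread count, which is why the overall gain is limited when printing dominates.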

The two figures below show the overall speedup and efficiency.
[Figure: runtime]
[Figure: efficiency]

The two figures below show the speedup and efficiency when the tool computes but does not print anything.
[Figure: speedup]
[Figure: efficiency]
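For reference, the quantities in these figures can be written down as a small sketch, assuming the usual definitions: speedup is the single-thread runtime divided by the runtime with p threads, and efficiency is speedup divided by p. The `amdahl_speedup` helper applies Amdahl's law to show how a serial portion (such as printing) caps the achievable speedup. The function names are hypothetical, and no measured values from this PR are used.

```c
#include <assert.h>

/* Speedup: runtime with one thread divided by runtime with p threads. */
double speedup(double t1, double tp)
{
    assert(t1 > 0.0 && tp > 0.0);
    return t1 / tp;
}

/* Efficiency: speedup divided by the number of threads. */
double efficiency(double t1, double tp, int p)
{
    assert(p > 0);
    return speedup(t1, tp) / p;
}

/* Amdahl's law: upper bound on speedup with p threads when a fraction
 * serial_fraction of the single-thread runtime cannot be parallelized. */
double amdahl_speedup(double serial_fraction, int p)
{
    assert(serial_fraction >= 0.0 && serial_fraction <= 1.0);
    assert(p > 0);
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p);
}
```

As an illustration with made-up numbers, if half of the runtime were serial output, `amdahl_speedup(0.5, 2)` gives about 1.33, and no thread count could push the speedup past 2.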

@echoix
Member

echoix commented Aug 22, 2024

A little unrelated, but have you heard of the benchmarking tool "hyperfine"? It computes statistics and reports outliers, adjusts the number of runs, does warmup runs, ignores shell startup time, supports parametrized sweeps, etc. It's quite useful for knowing whether you're benchmarking under correct conditions (it will report if noise or outliers may be caused by a busy system), and it tells you which of two (or more) alternatives is better.

https://github.com/sharkdp/hyperfine

@cyliang368
Contributor Author

@echoix Thanks for sharing. I will check it.

@echoix
Member

echoix commented Aug 22, 2024

The plots in the PR have discontinuities (ups and downs), which made me think of it. How many runs went into each data point, and what was their distribution? Were there any outliers?
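As a rough sketch of the kind of check being asked about, the repeated timings behind one data point can be summarized by their mean and sample variance, and runs far from the mean can be counted. The rule used here (more than k standard deviations from the mean) is a simple assumption for illustration and is not necessarily the method hyperfine uses. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n timings. */
double runs_mean(const double *t, size_t n)
{
    assert(t != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += t[i];
    return sum / (double)n;
}

/* Sample variance (divides by n - 1), so at least two runs are needed. */
double runs_variance(const double *t, size_t n)
{
    assert(t != NULL && n > 1);
    double m = runs_mean(t, n);
    double ss = 0.0;
    for (size_t i = 0; i < n; i++)
        ss += (t[i] - m) * (t[i] - m);
    return ss / (double)(n - 1);
}

/* Count runs more than k standard deviations from the mean.
 * Squared distances are compared so that no square root is needed. */
size_t runs_outliers(const double *t, size_t n, double k)
{
    double m = runs_mean(t, n);
    double limit = k * k * runs_variance(t, n);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if ((t[i] - m) * (t[i] - m) > limit)
            count++;
    return count;
}
```

A data point whose runs include flagged outliers, or whose variance is large relative to the mean, is a candidate for being re-measured on a quieter system before the plot is interpreted.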
