Stencil Construction Profiling - Speed up with GPU? #50
JanGaertner
started this conversation in
General
Profiling Stencil Collection
Collecting the stencils during construction of the WENOBase class is quite expensive and slows down start-up. To find out where the time goes, the construction of the WENOBase class was profiled with AMDuProf version 3.5.671.0 on a single core of an AMD Ryzen 5 PRO 3500U.
The hottest function is the calculation of the Gauss quadrature in WENOEXT/libWENOEXT/WENOBase/geometryWENO/geometryWENO.C (line 346 in f45593a).
Although the code already uses manual AVX instructions to improve performance, computing the powers still takes a long time. The profiler's assembly code analysis also shows that the vector operations take up most of the time.
[Image: singleCoreBuildUp]
The quadrature could possibly be reformulated as a single matrix multiplication, or as several matrix multiplications, which could then be executed on a GPU to improve performance.
Speed up with GPU