Made critical changes to small_gemm #568
Open
Meghana-vankadari wants to merge 5 commits into flame:master from Meghana-vankadari:gemm
Details:
- In case of GEMM, whenever beta is zero, we need to perform C = alpha * (A * B) instead of C = beta * C + alpha * (A * B). Added conditions to check the value of beta at different levels inside the small_gemm kernels and decide whether to scale C by beta or not.
- Modified small_gemm kernels to use BLIS-specific functions to retrieve the different fields of objects.
- Calling bli_gemm_check before entering bli_gemm_small to facilitate an early return in case of invalid inputs.
- For corner cases inside small_gemm kernels, a buffer called f_temp is used to load and store data to and from registers. The buffer is now populated with zeroes before use.
- In bli_gemm_front, the datatype of 'status' did not match the return type of bli_gemm_small. Corrected the datatype of the variable 'status' inside bli_gemm_front to err_t.

Change-Id: I8b52ad55008f028d6c8b7e0d20f746a869d9daea
Signed-off-by: Meghana Vankadari <[email protected]>
AMD-Internal: [CPUPL-689,SWLCSG-104]
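The beta handling described in this commit can be illustrated with a minimal reference sketch. This is not the small_gemm kernel code from the PR; `ref_dgemm` and its parameter layout are hypothetical names chosen for illustration, and column-major storage is assumed. It shows why the beta == 0 case is branched separately: C is overwritten without being read, so uninitialized or NaN contents of C cannot propagate into the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference GEMM (column-major), for illustration only:
 *   C = beta * C + alpha * (A * B)
 * When beta == 0, C is written but never read, so stale NaN/Inf
 * values already stored in C cannot leak into the output. */
void ref_dgemm(size_t m, size_t n, size_t k,
               double alpha, const double *a, size_t lda,
               const double *b, size_t ldb,
               double beta, double *c, size_t ldc)
{
    /* Leading dimensions must cover the column height of each matrix. */
    assert(lda >= m && ldb >= k && ldc >= m);

    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            double ab = 0.0;
            for (size_t p = 0; p < k; ++p)
                ab += a[i + p * lda] * b[p + j * ldb];

            if (beta == 0.0)
                c[i + j * ldc] = alpha * ab;                       /* skip scaling C */
            else
                c[i + j * ldc] = beta * c[i + j * ldc] + alpha * ab;
        }
    }
}
```

Calling this with beta = 0.0 on a C buffer filled with NaN yields finite results, whereas computing `0.0 * c[...]` unconditionally would produce NaN.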
Details:
- This implementation performs a transpose while packing a 16xk panel of the A buffer and passes it to the 16x3-nn kernel.
- The same implementation works for the case where B has transpose.

AMD-Internal: [CPUPL-1376]
Change-Id: I81f74deb609926598f62c30f5bd6fc80fb1b9a17
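The pack-with-transpose idea can be sketched as follows. This is a simplified scalar illustration, not the vectorized kernel from the commit; the function name `pack_a_transposed_16xk`, the `PANEL_ROWS` constant, and the assumption of column-major storage with the transposed A held as a k x 16 block are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { PANEL_ROWS = 16 };  /* panel height expected by a 16x3 kernel */

/* Hypothetical sketch: A is stored transposed, i.e. as a k x 16
 * column-major block with leading dimension lda. The logical 16 x k
 * panel op(A) = A^T is written into `packed` as a contiguous
 * column-major 16 x k buffer, so a no-transpose (nn) kernel can
 * consume it directly. */
void pack_a_transposed_16xk(size_t k, const double *a, size_t lda,
                            double *packed)
{
    assert(lda >= k);
    for (size_t p = 0; p < k; ++p)              /* column of the packed panel */
        for (size_t i = 0; i < PANEL_ROWS; ++i) /* row of the packed panel    */
            packed[i + p * PANEL_ROWS] = a[p + i * lda];
}
```

Folding the transpose into the packing step means the transposed case can reuse the existing nn kernel instead of needing a separate kernel for each transpose combination.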
Details:
- Decision logic to choose small_gemm has been moved to the BLAS interface.
- Redirecting all the calls to small_gemm from gemm_front to the native implementation.

AMD-Internal: [CPUPL-1376]
Change-Id: I6490f67113e9f7c272269f441c86f2a0b3c89a53
Details:
- Placed the optimized versions of the BLAS DGEMM and ZGEMM definitions under BLIS_CONFIG_EPYC, because they use the small GEMM kernels, which are defined only for zen family configurations.
- Added code to query and set cntx in the gemv and trsv framework before cntx is used to look up any function pointers, to avoid dereferencing a NULL pointer.

AMD-Internal: [CPUPL-1562]
Change-Id: I977d028ec4ddb57dcdc70e443e7708f36c01cca9
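The NULL-context guard mentioned for gemv and trsv follows a common pattern, sketched below with stand-in types. None of these names come from BLIS: `ctx_t`, `query_default_ctx`, and `resolve_ctx_id` are hypothetical placeholders for the context type, the context query, and a routine that reads a field through the context.

```c
#include <stddef.h>

/* Hypothetical stand-in for a library context holding kernel info. */
typedef struct { int id; } ctx_t;

static const ctx_t default_ctx = { 1 };

/* Hypothetical stand-in for querying the library's default context. */
const ctx_t *query_default_ctx(void)
{
    return &default_ctx;
}

/* If the caller passed NULL, fetch the default context first, and only
 * then read through the pointer. */
int resolve_ctx_id(const ctx_t *ctx)
{
    if (ctx == NULL)
        ctx = query_default_ctx();
    return ctx->id;
}
```

The key point is ordering: the query must happen before the first access through the pointer, which is what the commit describes adding.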
dzambare reviewed on Nov 5, 2021
ctype* A11; \
ctype* A21; \
ctype* a01; \
ctype* alpha11; \
Please avoid whitespace changes.
kernels/zen/bli_kernels_zen.h
Outdated
// gemm square matrix size friendly implementation
err_t bli_gemm_sqp
Is this needed in this PR?
Made some critical changes to small_gemm.
AMD BLIS Upstream:
This PR includes the following commits for AMD BLIS version 3.0.1:
ac2a50f Fixed blastest failure for haswell configuration
c597fa6 Disabled calling of bli_dgemm_small from gemm_front
1c6d455 Implemented 16x3 based gemm kernel for the case where A has transpose
ed7780d Made some critical changes to small_gemm kernels