WIP: Adding intrinsic implementations for LSX/LASX targets. #2477

Open
jinboson opened this issue Feb 20, 2025 · 4 comments

Comments

@jinboson

https://github.com/google/highway/blob/aba14629c2aaff4f842d7ec94c964b9a1f4aff39/hwy/ops/x86_256-inl.h#L267C12-L267C21

It seems that VFromD has not been defined yet around line 267, so why can it be used here? Or did I miss something?

@jinboson jinboson changed the title A litttle confused about VFromD here A bit confused about VFromD here Feb 20, 2025
@jinboson jinboson changed the title A bit confused about VFromD here WIP: Adding intrinsic implementations for LSX/LASX targets. Feb 20, 2025
@jan-wassenberg
Member

Hi, VFromD is actually defined in x86_128-inl.h, which is included from x86_256-inl.h and is thus available before that point.

FYI we do sometimes have bugs where one compiler is more tolerant of "use before definition", which we fix by reordering. This is why some ops have their dependencies in parentheses, like "// ------------------------------ LoadMaskBits (TestBit)", so we can more easily see this.

HTH?

@jinboson
Author

> Hi, VFromD is actually defined in x86_128-inl.h, which is included from the x86_256-inl.h and thus available before that point.

Yes, I found it in x86_128-inl.h, but line 267 seems to need a Vec256<TFromD> return type, right? There seems to be a mismatch in the size of the return type. That's what confused me.

copybara-service bot pushed a commit that referenced this issue Feb 20, 2025
copybara-service bot pushed a commit that referenced this issue Feb 20, 2025
@jan-wassenberg
Member

Ah, I see. This seems permissible due to the two-phase lookup of templates: the Zero overload is only required to be present at the point of instantiation.
But there is no harm in moving the 256-bit Zero overloads before their usage, we can do that. Thanks for pointing this out.
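
To illustrate the two-phase lookup point, here is a minimal sketch. It is not Highway code: `demo`, `Tag`, `Vec128`, `Vec256` and `ZeroBytes` are hypothetical names made up for this example. The call `Zero(d)` is a dependent name, so argument-dependent lookup also considers declarations visible at the point of instantiation, which is why the 32-byte overload can appear after the template that uses it. As far as I understand, this relies on the argument being a class type from the same namespace as the overloads.

```cpp
#include <cstddef>

namespace demo {  // hypothetical namespace, for illustration only

// Hypothetical tag type, loosely modelled on a descriptor such as D.
template <std::size_t kBytes>
struct Tag {};

// Hypothetical vector types of 16 and 32 bytes.
struct Vec128 { unsigned char raw[16]; };
struct Vec256 { unsigned char raw[32]; };

// 16-byte overload, declared BEFORE the template that calls Zero.
inline Vec128 Zero(Tag<16>) { return Vec128{}; }

// Zero(d) depends on the template parameter D, so overload resolution is
// deferred until instantiation. At this point only the 16-byte overload
// has been declared.
template <class D>
constexpr std::size_t ZeroBytes(D d) {
  return sizeof(Zero(d));  // unevaluated operand; only the type matters
}

// 32-byte overload, declared AFTER the template but before its first use
// with Tag<32>. ADL finds it because Tag<32> lives in namespace demo.
inline Vec256 Zero(Tag<32>) { return Vec256{}; }

// Both instantiations happen here, where both overloads are visible.
static_assert(ZeroBytes(Tag<16>()) == 16, "16-byte overload selected");
static_assert(ZeroBytes(Tag<32>()) == 32, "32-byte overload selected");

}  // namespace demo
```

If the `Tag<32>` overload were instead declared after the first instantiation with `Tag<32>`, the program would be fragile: some compilers may accept it and others reject it, which matches the "use before definition" tolerance differences mentioned earlier and is a reason to prefer declaring overloads before their use.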

copybara-service bot pushed a commit that referenced this issue Feb 20, 2025
copybara-service bot pushed a commit that referenced this issue Feb 20, 2025
@jinboson
Author

Really appreciate your help in understanding this. Thanks.
