
Determine appropriate device architecture during compile stage #2493

Closed
alexbaden opened this issue Oct 15, 2024 · 5 comments · Fixed by #2995

Comments

@alexbaden
Contributor

We have two new features that rely on the ocloc utility to query GPU architecture info (#1900) and to compile native GPU code (#1792) during the compile stage. ocloc needs a device parameter:

 -device <device_type>                     Target device.
                                            <device_type> can be: tgl, tgllp, rkl, adl-s, rpl-s, adl-p, rpl-p, adl-n, dg1, dg2-g10-a0, dg2-g10-a1, dg2-g10-b0, acm-g10, ats-m150, dg2-g10, dg2, dg2-g10-c0, dg2-g11-a0, dg2-g11-b0, acm-g11, ats-m75, dg2-g11, dg2-g11-b1, acm-g12, dg2-g12, dg2-g12-a0, pvc-xl-a0, pvc-sdv, pvc-xl-a0p, pvc-xt-a0, pvc-xt-b0, pvc-xt-b1, pvc, pvc-xt-c0, pvc-vg, pvc-xt-c0-vg, mtl-u-a0, arl-s, arl-u, mtl-m, mtl-s, mtl-u, mtl, mtl-u-b0, mtl-h-a0, mtl-h, mtl-p, mtl-h-b0, arl-h-a0, arl-h, arl-h-b0, bmg-g21-a0, bmg-g21-a1, bmg-g21, bmg-g21-b0, lnl-a0, lnl-a1, lnl-m, lnl-b0, xe, xe2, gen12lp, xe-hpc, xe-hpc-vg, xe-hpg, xe-lp, xe-lpg, xe-lpgplus, xe2-hpg, xe2-lpg, an IP version, or a hexadecimal value with a 0x prefix
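Invoking ocloc from the compile stage might look like the following sketch. This is a minimal illustration, not the actual implementation: the `ocloc_compile` wrapper name, the SPIR-V input path, and the default `pvc` target are all assumptions for the example.

```python
import shutil
import subprocess

def ocloc_compile(spirv_path, device="pvc"):
    """Hypothetical wrapper: ask ocloc to compile a SPIR-V module for a
    specific GPU via its -device parameter.

    Returns the CompletedProcess, or None when ocloc is not on PATH.
    """
    if shutil.which("ocloc") is None:
        return None
    # `ocloc compile -file <input> -device <name>` selects the target device.
    return subprocess.run(
        ["ocloc", "compile", "-file", spirv_path, "-device", device],
        capture_output=True,
        text=True,
    )
```

The open question in this issue is where the `device` argument comes from, since the compile stage has to pick it without a fixed target baked in.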

We currently get a name string from PyTorch:

'name': 'Intel(R) Data Center GPU Max 1100',
'name': 'Intel(R) Graphics [0xe20c]',
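A best-effort mapping from the name string to an ocloc `-device` argument could look like the sketch below. The bracketed hex value is a PCI device ID, and ocloc's help above says it accepts a hexadecimal value with a 0x prefix directly; the fallback name table here is an illustrative assumption that would need to be hand-curated.

```python
import re

# Illustrative fallback table mapping marketing names to ocloc device names;
# a real table would need to be curated and kept up to date.
_NAME_TO_OCLOC = {
    "Intel(R) Data Center GPU Max 1100": "pvc",
}

def ocloc_device_from_name(name):
    """Best-effort mapping of a PyTorch device name string to an ocloc
    -device argument; returns None when the name is not recognized."""
    # Names like 'Intel(R) Graphics [0xe20c]' embed a PCI device ID, which
    # ocloc accepts directly as a 0x-prefixed hexadecimal value.
    m = re.search(r"\[(0x[0-9a-fA-F]+)\]", name)
    if m:
        return m.group(1)
    return _NAME_TO_OCLOC.get(name)
```

The fragility of this approach (every new device name needs a table entry) is why the comment below suggests asking PyTorch to expose the oneAPI device architecture API instead.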

We can try to parse the name string, or we can ask PyTorch to implement the device architecture API from oneAPI: https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/ext/oneapi/experimental/device_architecture.hpp
However, this API is changing between the 2024 and 2025 oneAPI releases, so we would need to be careful to use the right enum in that case (particularly if we compile against one version but run against a different one).

@whitneywhtsang
Contributor

@guangyey is going to check with the DPC++ team on whether there is a plan to move the device architecture API out of the experimental namespace.

@EikanWang
Contributor

@guangyey, please keep the issue updated.

@guangyey

guangyey commented Nov 6, 2024

It depends on oneAPI 2025.0.

@EikanWang
Contributor

@alexbaden, @whitneywhtsang, we are upgrading PyTorch to support 2025.0. We will submit a PR to add the feature as soon as 2025.0 support in PyTorch is ready.

@guangyey

The PR is pytorch/pytorch#138186.
The compiler team replied that they plan to move the device architecture API out of the experimental namespace, but they have no ETA.
