Parse `architecture` from `PyTorch` instead of hard coding #2995

whitneywhtsang · 2024-12-11T17:35:09Z

With pytorch/pytorch#138186, `architecture` is added to the XPU device properties.
Instead of hard coding `pvc` when invoking `ocloc`, this PR passes the device architecture parsed from PyTorch.

Signed-off-by: Whitney Tsang <[email protected]>
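
For readers who want a concrete picture of the idea, here is a minimal sketch and not the code from this PR. It assumes the device properties object (for example, whatever `torch.xpu.get_device_properties` returns) exposes an `architecture` attribute. The helper names and the numeric keys in `EXAMPLE_ARCH_NAMES` are hypothetical placeholders, not real architecture IDs.

```python
from types import SimpleNamespace

# Hypothetical lookup table: the numeric keys are placeholders chosen for
# illustration, NOT real architecture IDs reported by PyTorch.
EXAMPLE_ARCH_NAMES = {1: "pvc", 2: "dg2"}


def device_arch_name(props, arch_names, default=None):
    """Return the device name mapped from `props.architecture`, else `default`."""
    # Older PyTorch builds may not expose `architecture`, so read it defensively.
    arch = getattr(props, "architecture", None)
    return arch_names.get(arch, default)


def ocloc_device_args(props, arch_names):
    """Build the `-device <name>` arguments, or an empty list if unknown."""
    name = device_arch_name(props, arch_names)
    return ["-device", name] if name else []


# Usage with stand-in property objects (no GPU or PyTorch required).
known = SimpleNamespace(architecture=2)
legacy = SimpleNamespace()  # simulates a build without `architecture`
print(ocloc_device_args(known, EXAMPLE_ARCH_NAMES))
print(device_arch_name(legacy, EXAMPLE_ARCH_NAMES, default="pvc"))
```

Keeping the lookup table as a parameter makes the fallback explicit: an unknown or missing architecture yields the caller's default instead of a silently hard-coded value.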

third_party/intel/backend/compiler.py

alexbaden

LGTM, thanks!

pbchekin · 2024-12-11T22:29:06Z

Looks like this change makes CI slower (~2h vs ~1.5h).

whitneywhtsang · 2024-12-12T00:39:33Z

> Looks like this change makes CI slower (~2h vs ~1.5h).

With 5e6d2c9, we verified that the supported extensions are unchanged, so the increase in CI time should be due to the `ocloc` invocations. 89aa35d attempts to halve the number of invocations.

chengjunlu · 2024-12-12T01:11:38Z

> Looks like this change makes CI slower (~2h vs ~1.5h).

> With 5e6d2c9, we verified that the supported extensions are unchanged, so the increase in CI time should be due to the `ocloc` invocations. 89aa35d attempts to halve the number of invocations.

Maybe we can cache the hardware capabilities to further reduce the number of `ocloc` calls.
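
One way such caching could look, as a rough sketch and not the implementation merged here: memoize the query result per architecture with `functools.lru_cache`. The names `make_cached_query` and `fake_query` are hypothetical, and `fake_query` stands in for the real `ocloc` invocation so the example runs without the tool installed.

```python
import functools


def make_cached_query(run_query):
    """Wrap `run_query(device_arch)` so each architecture is queried only once."""

    @functools.lru_cache(maxsize=None)
    def cached(device_arch):
        # Store an immutable frozenset so callers cannot mutate the cached value.
        return frozenset(run_query(device_arch))

    return cached


calls = []


def fake_query(device_arch):
    # Hypothetical stand-in for an expensive `ocloc` call; records each invocation.
    calls.append(device_arch)
    return ["cl_khr_fp16"]


query = make_cached_query(fake_query)
query("pvc")
query("pvc")  # served from the cache, fake_query is not called again
query("dg2")
print(calls)
```

The cache is keyed on the architecture string, so repeated compilations for the same device pay the query cost once per process.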

whitneywhtsang · 2024-12-12T02:17:02Z

> Looks like this change makes CI slower (~2h vs ~1.5h).

> With 5e6d2c9, we verified that the supported extensions are unchanged, so the increase in CI time should be due to the `ocloc` invocations. 89aa35d attempts to halve the number of invocations.

> Maybe we can cache the hardware capabilities to further reduce the number of `ocloc` calls.

New CI time is 1h 36m 48s; let's see how much more caching can reduce it.

whitneywhtsang · 2024-12-12T03:54:21Z

> Looks like this change makes CI slower (~2h vs ~1.5h).

> With 5e6d2c9, we verified that the supported extensions are unchanged, so the increase in CI time should be due to the `ocloc` invocations. 89aa35d attempts to halve the number of invocations.

> Maybe we can cache the hardware capabilities to further reduce the number of `ocloc` calls.

> New CI time is 1h 36m 48s; let's see if caching can reduce it further in a different PR.

With caching, CI time is further reduced to 1h 25m 26s.

Parse architecture from PyTorch instead of hard coding

686a499

Signed-off-by: Whitney Tsang <[email protected]>

whitneywhtsang requested review from pbchekin, alexbaden, etiotto, chengjunlu and quintinwang5 December 11, 2024 17:35

whitneywhtsang self-assigned this Dec 11, 2024

whitneywhtsang linked an issue Dec 11, 2024 that may be closed by this pull request

Determine appropriate device architecture during compile stage #2493

Closed

pbchekin reviewed Dec 11, 2024

third_party/intel/backend/compiler.py (outdated, resolved)

whitneywhtsang requested a review from pbchekin December 11, 2024 17:53

address review comment

a853d16

Signed-off-by: Whitney Tsang <[email protected]>

whitneywhtsang force-pushed the whitneywhtsang/device_arch branch from aa8c13c to a853d16 Compare December 11, 2024 18:37

Fix LTS failure

d378a2b

Signed-off-by: Whitney Tsang <[email protected]>

whitneywhtsang force-pushed the whitneywhtsang/device_arch branch from 4e1e895 to d378a2b Compare December 11, 2024 19:32

Add support for more archs

47a3216

Signed-off-by: Whitney Tsang <[email protected]>

whitneywhtsang force-pushed the whitneywhtsang/device_arch branch from 37b6217 to 47a3216 Compare December 11, 2024 20:01

pbchekin reviewed Dec 11, 2024

third_party/intel/backend/compiler.py (outdated, resolved)

pbchekin approved these changes Dec 11, 2024

address review comment

219b490

Signed-off-by: Whitney Tsang <[email protected]>

alexbaden approved these changes Dec 11, 2024

whitneywhtsang added 3 commits December 11, 2024 22:43

Add temporary checks to ensure device properties are not changed

5e6d2c9

Signed-off-by: Whitney Tsang <[email protected]>

Revert "Add temporary checks to ensure device properties are not changed"

acf42a8

This reverts commit 5e6d2c9.

Reduce number of ocloc invocation

89aa35d

Signed-off-by: Whitney Tsang <[email protected]>

chengjunlu approved these changes Dec 12, 2024

cache ocloc result

3673195

Signed-off-by: Whitney Tsang <[email protected]>

whitneywhtsang merged commit b70c7f7 into main Dec 12, 2024
5 checks passed

whitneywhtsang deleted the whitneywhtsang/device_arch branch December 12, 2024 03:54

whitneywhtsang mentioned this pull request Dec 16, 2024

Query device architecture and feature flags in Triton #1900

Closed

whitneywhtsang linked an issue Dec 16, 2024 that may be closed by this pull request

Query device architecture and feature flags in Triton #1900

Closed
