-
Notifications
You must be signed in to change notification settings - Fork 11.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BasicTTI] When costing a scalarized cast, use distinct src and dest types #109325
base: main
Are you sure you want to change the base?
Conversation
…types When we fallback on scalarizing a vector cast, BasicTTI was assuming that the type being extracted was the same as that being inserted. This is not true when scalarizing a e.g. truncate, zext, or sext. I have made no attempt to confirm the test diffs are profitable in terms of e.g. vectorization result for the various targets impacted.
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-llvm-analysis Author: Philip Reames (preames) ChangesWhen we fallback on scalarizing a vector cast, BasicTTI was assuming that the type being extracted was the same as that being inserted. This is not true when scalarizing a e.g. truncate, zext, or sext. I have made no attempt to confirm the test diffs are profitable in terms of e.g. vectorization result for the various targets impacted. Patch is 975.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/109325.diff 19 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index 7198e134a2d262..4cc4f6704c155d 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -1186,7 +1186,9 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
// Return the cost of multiple scalar invocation plus the cost of
// inserting and extracting the values.
- return getScalarizationOverhead(DstVTy, /*Insert*/ true, /*Extract*/ true,
+ return getScalarizationOverhead(SrcVTy, /*Insert*/ false, /*Extract*/ true,
+ CostKind) +
+ getScalarizationOverhead(DstVTy, /*Insert*/ true, /*Extract*/ false,
CostKind) +
Num * Cost;
}
diff --git a/llvm/test/Analysis/CostModel/AArch64/cast.ll b/llvm/test/Analysis/CostModel/AArch64/cast.ll
index fa778864ae978f..01e9d3483fc167 100644
--- a/llvm/test/Analysis/CostModel/AArch64/cast.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/cast.ll
@@ -937,8 +937,8 @@ define i32 @casts_no_users() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r97 = fptosi <2 x float> undef to <2 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r98 = fptoui <2 x float> undef to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r99 = fptosi <2 x float> undef to <2 x i64>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r100 = fptoui <2 x double> undef to <2 x i1>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r101 = fptosi <2 x double> undef to <2 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r100 = fptoui <2 x double> undef to <2 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r101 = fptosi <2 x double> undef to <2 x i1>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r102 = fptoui <2 x double> undef to <2 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r103 = fptosi <2 x double> undef to <2 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r104 = fptoui <2 x double> undef to <2 x i16>
@@ -947,8 +947,8 @@ define i32 @casts_no_users() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r107 = fptosi <2 x double> undef to <2 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r108 = fptoui <2 x double> undef to <2 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r109 = fptosi <2 x double> undef to <2 x i64>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %r110 = fptoui <4 x float> undef to <4 x i1>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %r111 = fptosi <4 x float> undef to <4 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %r110 = fptoui <4 x float> undef to <4 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %r111 = fptosi <4 x float> undef to <4 x i1>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r112 = fptoui <4 x float> undef to <4 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r113 = fptosi <4 x float> undef to <4 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r114 = fptoui <4 x float> undef to <4 x i16>
@@ -957,8 +957,8 @@ define i32 @casts_no_users() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r117 = fptosi <4 x float> undef to <4 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r118 = fptoui <4 x float> undef to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r119 = fptosi <4 x float> undef to <4 x i64>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %r120 = fptoui <4 x double> undef to <4 x i1>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %r121 = fptosi <4 x double> undef to <4 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %r120 = fptoui <4 x double> undef to <4 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %r121 = fptosi <4 x double> undef to <4 x i1>
; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r122 = fptoui <4 x double> undef to <4 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r123 = fptosi <4 x double> undef to <4 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r124 = fptoui <4 x double> undef to <4 x i16>
@@ -967,8 +967,8 @@ define i32 @casts_no_users() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r127 = fptosi <4 x double> undef to <4 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r128 = fptoui <4 x double> undef to <4 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r129 = fptosi <4 x double> undef to <4 x i64>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 41 for instruction: %r130 = fptoui <8 x float> undef to <8 x i1>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 41 for instruction: %r131 = fptosi <8 x float> undef to <8 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 37 for instruction: %r130 = fptoui <8 x float> undef to <8 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 37 for instruction: %r131 = fptosi <8 x float> undef to <8 x i1>
; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r132 = fptoui <8 x float> undef to <8 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r133 = fptosi <8 x float> undef to <8 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %r134 = fptoui <8 x float> undef to <8 x i16>
@@ -977,8 +977,8 @@ define i32 @casts_no_users() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r137 = fptosi <8 x float> undef to <8 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r138 = fptoui <8 x float> undef to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r139 = fptosi <8 x float> undef to <8 x i64>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %r140 = fptoui <8 x double> undef to <8 x i1>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %r141 = fptosi <8 x double> undef to <8 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 35 for instruction: %r140 = fptoui <8 x double> undef to <8 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 35 for instruction: %r141 = fptosi <8 x double> undef to <8 x i1>
; CHECK-NEXT: Cost Model: Found an estimated cost of 11 for instruction: %r142 = fptoui <8 x double> undef to <8 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 11 for instruction: %r143 = fptosi <8 x double> undef to <8 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 11 for instruction: %r144 = fptoui <8 x double> undef to <8 x i16>
@@ -987,8 +987,8 @@ define i32 @casts_no_users() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r147 = fptosi <8 x double> undef to <8 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r148 = fptoui <8 x double> undef to <8 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r149 = fptosi <8 x double> undef to <8 x i64>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 83 for instruction: %r150 = fptoui <16 x float> undef to <16 x i1>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 83 for instruction: %r151 = fptosi <16 x float> undef to <16 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 75 for instruction: %r150 = fptoui <16 x float> undef to <16 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 75 for instruction: %r151 = fptosi <16 x float> undef to <16 x i1>
; CHECK-NEXT: Cost Model: Found an estimated cost of 11 for instruction: %r152 = fptoui <16 x float> undef to <16 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 11 for instruction: %r153 = fptosi <16 x float> undef to <16 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r154 = fptoui <16 x float> undef to <16 x i16>
@@ -997,8 +997,8 @@ define i32 @casts_no_users() {
; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %r157 = fptosi <16 x float> undef to <16 x i32>
; CHECK-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %r158 = fptoui <16 x float> undef to <16 x i64>
; CHECK-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %r159 = fptosi <16 x float> undef to <16 x i64>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 87 for instruction: %r160 = fptoui <16 x double> undef to <16 x i1>
-; CHECK-NEXT: Cost Model: Found an estimated cost of 87 for instruction: %r161 = fptosi <16 x double> undef to <16 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 71 for instruction: %r160 = fptoui <16 x double> undef to <16 x i1>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 71 for instruction: %r161 = fptosi <16 x double> undef to <16 x i1>
; CHECK-NEXT: Cost Model: Found an estimated cost of 23 for instruction: %r162 = fptoui <16 x double> undef to <16 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 23 for instruction: %r163 = fptosi <16 x double> undef to <16 x i8>
; CHECK-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %r164 = fptoui <16 x double> undef to <16 x i16>
@@ -1363,8 +1363,8 @@ define i32 @casts_no_users() {
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r97 = fptosi <2 x float> undef to <2 x i32>
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r98 = fptoui <2 x float> undef to <2 x i64>
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r99 = fptosi <2 x float> undef to <2 x i64>
-; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r100 = fptoui <2 x double> undef to <2 x i1>
-; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r101 = fptosi <2 x double> undef to <2 x i1>
+; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r100 = fptoui <2 x double> undef to <2 x i1>
+; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r101 = fptosi <2 x double> undef to <2 x i1>
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r102 = fptoui <2 x double> undef to <2 x i8>
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r103 = fptosi <2 x double> undef to <2 x i8>
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r104 = fptoui <2 x double> undef to <2 x i16>
@@ -1373,8 +1373,8 @@ define i32 @casts_no_users() {
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r107 = fptosi <2 x double> undef to <2 x i32>
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r108 = fptoui <2 x double> undef to <2 x i64>
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r109 = fptosi <2 x double> undef to <2 x i64>
-; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %r110 = fptoui <4 x float> undef to <4 x i1>
-; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %r111 = fptosi <4 x float> undef to <4 x i1>
+; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %r110 = fptoui <4 x float> undef to <4 x i1>
+; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %r111 = fptosi <4 x float> undef to <4 x i1>
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r112 = fptoui <4 x float> undef to <4 x i8>
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r113 = fptosi <4 x float> undef to <4 x i8>
; FIXED-MIN-256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r114 = fptoui <4 x float> undef to <4 x i16>
@@ -1576,8 +1576,8 @@ define i32 @casts_no_users() {
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r97 = fptosi <2 x float> undef to <2 x i32>
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r98 = fptoui <2 x float> undef to <2 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r99 = fptosi <2 x float> undef to <2 x i64>
-; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r100 = fptoui <2 x double> undef to <2 x i1>
-; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r101 = fptosi <2 x double> undef to <2 x i1>
+; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r100 = fptoui <2 x double> undef to <2 x i1>
+; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r101 = fptosi <2 x double> undef to <2 x i1>
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r102 = fptoui <2 x double> undef to <2 x i8>
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r103 = fptosi <2 x double> undef to <2 x i8>
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r104 = fptoui <2 x double> undef to <2 x i16>
@@ -1586,8 +1586,8 @@ define i32 @casts_no_users() {
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r107 = fptosi <2 x double> undef to <2 x i32>
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r108 = fptoui <2 x double> undef to <2 x i64>
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r109 = fptosi <2 x double> undef to <2 x i64>
-; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %r110 = fptoui <4 x float> undef to <4 x i1>
-; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %r111 = fptosi <4 x float> undef to <4 x i1>
+; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %r110 = fptoui <4 x float> undef to <4 x i1>
+; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %r111 = fptosi <4 x float> undef to <4 x i1>
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r112 = fptoui <4 x float> undef to <4 x i8>
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r113 = fptosi <4 x float> undef to <4 x i8>
; FIXED-MIN-2048-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r114 = fptoui <4 x float> undef to <4 x i16>
@@ -3292,38 +3292,38 @@ define void @fp16cast() {
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r95 = fptosi <2 x half> undef to <2 x i16>
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r96 = fptoui <2 x half> undef to <2 x i32>
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r97 = fptosi <2 x half> undef to <2 x i32>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r98 = fptoui <2 x half> undef to <2 x i64>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %r99 = fptosi <2 x half> undef to <2 x i64>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r98 = fptoui <2 x half> undef to <2 x i64>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %r99 = fptosi <2 x half> undef to <2 x i64>
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r110 = fptoui <4 x half> undef to <4 x i1>
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r111 = fptosi <4 x half> undef to <4 x i1>
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r112 = fptoui <4 x half> undef to <4 x i8>
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r113 = fptosi <4 x half> undef to <4 x i8>
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r114 = fptoui <4 x half> undef to <4 x i16>
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r115 = fptosi <4 x half> undef to <4 x i16>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %r116 = fptoui <4 x half> undef to <4 x i32>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %r117 = fptosi <4 x half> undef to <4 x i32>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %r118 = fptoui <4 x half> undef to <4 x i64>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %r119 = fptosi <4 x half> undef to <4 x i64>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %r130 = fptoui <8 x half> undef to <8 x i1>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %r131 = fptosi <8 x half> undef to <8 x i1>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %r132 = fptoui <8 x half> undef to <8 x i8>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %r133 = fptosi <8 x half> undef to <8 x i8>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %r116 = fptoui <4 x half> undef to <4 x i32>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %r117 = fptosi <4 x half> undef to <4 x i32>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %r118 = fptoui <4 x half> undef to <4 x i64>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %r119 = fptosi <4 x half> undef to <4 x i64>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 38 for instruction: %r130 = fptoui <8 x half> undef to <8 x i1>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 38 for instruction: %r131 = fptosi <8 x half> undef to <8 x i1>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 38 for instruction: %r132 = fptoui <8 x half> undef to <8 x i8>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 38 for instruction: %r133 = fptosi <8 x half> undef to <8 x i8>
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r134 = fptoui <8 x half> undef to <8 x i16>
; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r135 = fptosi <8 x half> undef to <8 x i16>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 41 for instruction: %r136 = fptoui <8 x half> undef to <8 x i32>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 41 for instruction: %r137 = fptosi <8 x half> undef to <8 x i32>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %r138 = fptoui <8 x half> undef to <8 x i64>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %r139 = fptosi <8 x half> undef to <8 x i64>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 81 for instruction: %r150 = fptoui <16 x half> undef to <16 x i1>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 81 for instruction: %r151 = fptosi <16 x half> undef to <16 x i1>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 81 for instruction: %r152 = fptoui <16 x half> undef to <16 x i8>
-; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 81 for instruction: %r153 = fptosi <16 x half> undef to <16 x i8>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 37 for instruction: %r136 = fptoui <8 x half> undef to <8 x i32>
+; CHECK-NOFP16-NEXT: Cost Model: Found an estimated cost of 37 for instruction: %r137 = fptosi <8 x half> undef to <8...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
When we fallback on scalarizing a vector cast, BasicTTI was assuming that the type being extracted was the same as that being inserted. This is not true when scalarizing a e.g. truncate, zext, or sext.
I have made no attempt to confirm the test diffs are profitable in terms of e.g. vectorization result for the various targets impacted.