Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix performance bug in buildLocationList #109343

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

tmsri
Copy link
Member

@tmsri tmsri commented Sep 19, 2024

In buildLocationList, with basic block sections, we iterate over
every basic block twice to detect section start and end. This is
sub-optimal and shows up as significantly time consuming when
compiling large functions.

This patch uses the set of sections already stored in MBBSectionRanges
and iterates over sections rather than basic blocks.

When detecting if loclists can be merged, the end label of an entry is
matched with the beginning label of the next entry. For the section
corresponding to the entry basic block, this is skipped. This is
because the loc list uses the end label corresponding to the function
whereas the MBBSectionRanges map uses the function end label.

For example:

.Lfunc_begin0:
.file
.loc 0 4 0 # ex2.cc:4:0
.cfi_startproc
.Ltmp0:
.loc 0 8 5 prologue_end # ex2.cc:8:5
....
.LBB_END0_0:
.cfi_endproc
.section .text._Z4testv,"ax",@progbits,unique,1
...
.Lfunc_end0:
.size _Z4testv, .Lfunc_end0-_Z4testv

The debug loc uses ".LBB_END0_0" for the end of the section whereas
MBBSectionRanges uses ".Lfunc_end0".

It is alright to skip this as we already check the section corresponding
to the debugloc entry.

Added a new test case to check that if this works correctly when the
variable's value is mutated in the entry section.

In buildLocationList, with basic block sections, we iterate over
every basic block twice to detect section start and end. This is
sub-optimal and shows up as significantly time consuming when
compiling large functions.

This patch uses the set of sections already stored in MBBSectionRanges
and iterates over sections rather than basic blocks.

When detecting if loclists can be merged, the end label of an entry is
matched with the beginning label of the next entry. For the section
corresponding to the entry basic block, this is skipped. This is
because the loc list uses the end label corresponding to the function
whereas the MBBSectionRanges map uses the function end label.

For example:

.Lfunc_begin0:
.file
.loc 0 4 0 # ex2.cc:4:0
.cfi_startproc
.Ltmp0:
.loc 0 8 5 prologue_end # ex2.cc:8:5
....
.LBB_END0_0:
.cfi_endproc
.section .text._Z4testv,"ax",@progbits,unique,1
...
.Lfunc_end0:
.size _Z4testv, .Lfunc_end0-_Z4testv

The debug loc uses ".LBB_END0_0" for the end of the section whereas
MBBSectionRanges uses ".Lfunc_end0".

It is alright to skip this as we already check the section corresponding
to the debugloc entry.

Added a new test case to check that if this works correctly when the
variable's value is mutated in the entry section.
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 19, 2024

@llvm/pr-subscribers-debuginfo

Author: Sriraman Tallam (tmsri)

Changes

In buildLocationList, with basic block sections, we iterate over
every basic block twice to detect section start and end. This is
sub-optimal and shows up as significantly time consuming when
compiling large functions.

This patch uses the set of sections already stored in MBBSectionRanges
and iterates over sections rather than basic blocks.

When detecting if loclists can be merged, the end label of an entry is
matched with the beginning label of the next entry. For the section
corresponding to the entry basic block, this is skipped. This is
because the loc list uses the end label corresponding to the function
whereas the MBBSectionRanges map uses the function end label.

For example:

.Lfunc_begin0:
.file
.loc 0 4 0 # ex2.cc:4:0
.cfi_startproc
.Ltmp0:
.loc 0 8 5 prologue_end # ex2.cc:8:5
....
.LBB_END0_0:
.cfi_endproc
.section .text._Z4testv,"ax",@progbits,unique,1
...
.Lfunc_end0:
.size _Z4testv, .Lfunc_end0-_Z4testv

The debug loc uses ".LBB_END0_0" for the end of the section whereas
MBBSectionRanges uses ".Lfunc_end0".

It is alright to skip this as we already check the section corresponding
to the debugloc entry.

Added a new test case to check that if this works correctly when the
variable's value is mutated in the entry section.


Full diff: https://github.com/llvm/llvm-project/pull/109343.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (+8-5)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+20-18)
  • (added) llvm/test/DebugInfo/X86/basic-block-sections-debug-loclist-6.ll (+92)
  • (modified) llvm/test/DebugInfo/X86/basic-block-sections_1.ll (+3-3)
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index db7adfd3b21e5f..4c5afaa5132755 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1737,6 +1737,12 @@ void AsmPrinter::emitFunctionBody() {
   bool IsEHa = MMI->getModule()->getModuleFlag("eh-asynch");
 
   bool CanDoExtraAnalysis = ORE->allowExtraAnalysis(DEBUG_TYPE);
+  // Create a slot for the entry basic block section so that the section
+  // order is preserved when iterating over MBBSectionRanges.
+  if (!MF->empty())
+    MBBSectionRanges[MF->front().getSectionID()] =
+        MBBSectionRange{CurrentFnBegin, nullptr};
+
   for (auto &MBB : *MF) {
     // Print a label for the basic block.
     emitBasicBlockStart(MBB);
@@ -2000,11 +2006,8 @@ void AsmPrinter::emitFunctionBody() {
   }
   for (auto &Handler : Handlers)
     Handler->markFunctionEnd();
-
-  assert(!MBBSectionRanges.contains(MF->front().getSectionID()) &&
-         "Overwrite section range");
-  MBBSectionRanges[MF->front().getSectionID()] =
-      MBBSectionRange{CurrentFnBegin, CurrentFnEnd};
+  // Update the end label of the entry block's section.
+  MBBSectionRanges[MF->front().getSectionID()].EndLabel = CurrentFnEnd;
 
   // Print out jump tables referenced by the function.
   emitJumpTableInfo();
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index e9649f9ff81658..890deb319ea73e 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -34,6 +34,7 @@
 #include "llvm/DebugInfo/DWARF/DWARFDataExtractor.h"
 #include "llvm/DebugInfo/DWARF/DWARFExpression.h"
 #include "llvm/IR/Constants.h"
+#include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/GlobalVariable.h"
 #include "llvm/IR/Module.h"
@@ -1772,18 +1773,14 @@ bool DwarfDebug::buildLocationList(SmallVectorImpl<DebugLocEntry> &DebugLoc,
     // span each individual section in the range from StartLabel to EndLabel.
     if (Asm->MF->hasBBSections() && StartLabel == Asm->getFunctionBegin() &&
         !Instr->getParent()->sameSection(&Asm->MF->front())) {
-      const MCSymbol *BeginSectionLabel = StartLabel;
-
-      for (const MachineBasicBlock &MBB : *Asm->MF) {
-        if (MBB.isBeginSection() && &MBB != &Asm->MF->front())
-          BeginSectionLabel = MBB.getSymbol();
-
-        if (MBB.sameSection(Instr->getParent())) {
-          DebugLoc.emplace_back(BeginSectionLabel, EndLabel, Values);
+      for (const auto &[MBBSectionId, MBBSectionRange] :
+           Asm->MBBSectionRanges) {
+        if (Instr->getParent()->getSectionID() == MBBSectionId) {
+          DebugLoc.emplace_back(MBBSectionRange.BeginLabel, EndLabel, Values);
           break;
         }
-        if (MBB.isEndSection())
-          DebugLoc.emplace_back(BeginSectionLabel, MBB.getEndSymbol(), Values);
+        DebugLoc.emplace_back(MBBSectionRange.BeginLabel,
+                              MBBSectionRange.EndLabel, Values);
       }
     } else {
       DebugLoc.emplace_back(StartLabel, EndLabel, Values);
@@ -1824,22 +1821,27 @@ bool DwarfDebug::buildLocationList(SmallVectorImpl<DebugLocEntry> &DebugLoc,
     RangeMBB = &Asm->MF->front();
   else
     RangeMBB = Entries.begin()->getInstr()->getParent();
+  auto RangeIt = Asm->MBBSectionRanges.find(RangeMBB->getSectionID());
+  assert(RangeIt != Asm->MBBSectionRanges.end() &&
+         "Range MBB not found in MBBSectionRanges!");
   auto *CurEntry = DebugLoc.begin();
   auto *NextEntry = std::next(CurEntry);
+  auto NextRangeIt = std::next(RangeIt);
   while (NextEntry != DebugLoc.end()) {
-    // Get the last machine basic block of this section.
-    while (!RangeMBB->isEndSection())
-      RangeMBB = RangeMBB->getNextNode();
-    if (!RangeMBB->getNextNode())
+    if (NextRangeIt == Asm->MBBSectionRanges.end())
       return false;
     // CurEntry should end the current section and NextEntry should start
     // the next section and the Values must match for these two ranges to be
-    // merged.
-    if (CurEntry->getEndSym() != RangeMBB->getEndSymbol() ||
-        NextEntry->getBeginSym() != RangeMBB->getNextNode()->getSymbol() ||
+    // merged.  Do not match the section label end if it is the entry block
+    // section.  This is because the end label for the Debug Loc and the
+    // Function end label could be different.
+    if ((RangeIt->second.EndLabel != Asm->getFunctionEnd() &&
+         CurEntry->getEndSym() != RangeIt->second.EndLabel) ||
+        NextEntry->getBeginSym() != NextRangeIt->second.BeginLabel ||
         CurEntry->getValues() != NextEntry->getValues())
       return false;
-    RangeMBB = RangeMBB->getNextNode();
+    RangeIt = NextRangeIt;
+    NextRangeIt = std::next(RangeIt);
     CurEntry = NextEntry;
     NextEntry = std::next(CurEntry);
   }
diff --git a/llvm/test/DebugInfo/X86/basic-block-sections-debug-loclist-6.ll b/llvm/test/DebugInfo/X86/basic-block-sections-debug-loclist-6.ll
new file mode 100644
index 00000000000000..8c8eef68b2ec0c
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/basic-block-sections-debug-loclist-6.ll
@@ -0,0 +1,92 @@
+; RUN: llc %s -mtriple=x86_64-unknown-linux-gnu --dwarf-version=4 --basic-block-sections=none -filetype=obj -o - | llvm-dwarfdump - | FileCheck %s
+; RUN: llc %s -mtriple=x86_64-unknown-linux-gnu --dwarf-version=4 --basic-block-sections=all -filetype=obj -o - | llvm-dwarfdump - | FileCheck --check-prefix=SECTIONS %s
+; RUN: llc %s -mtriple=x86_64-unknown-linux-gnu --dwarf-version=5 --basic-block-sections=none -filetype=obj -o - | llvm-dwarfdump - | FileCheck %s
+; RUN: llc %s -mtriple=x86_64-unknown-linux-gnu --dwarf-version=5 --basic-block-sections=all -filetype=obj -o - | llvm-dwarfdump - | FileCheck --check-prefix=SECTIONS %s
+
+; CHECK:      DW_TAG_variable
+; CHECK-NEXT: DW_AT_location
+; CHECK-NEXT: [0x{{[0-9a-f]+}}, 0x{{[0-9a-f]+}}): DW_OP_consts +7, DW_OP_stack_value
+; CHECK-NEXT: [0x{{[0-9a-f]+}}, 0x{{[0-9a-f]+}}): DW_OP_consts +8, DW_OP_stack_value
+; CHECK-NEXT: DW_AT_name	("i")
+
+; SECTIONS:      DW_TAG_variable
+; SECTIONS-NEXT: DW_AT_location
+; SECTIONS-NEXT: [0x{{[0-9a-f]+}}, 0x{{[0-9a-f]+}}): DW_OP_consts +7, DW_OP_stack_value
+; SECTIONS-NEXT: [0x{{[0-9a-f]+}}, 0x{{[0-9a-f]+}}): DW_OP_consts +8, DW_OP_stack_value
+; SECTIONS-NEXT: [0x{{[0-9a-f]+}}, 0x{{[0-9a-f]+}}): DW_OP_consts +8, DW_OP_stack_value
+; SECTIONS-NEXT: [0x{{[0-9a-f]+}}, 0x{{[0-9a-f]+}}): DW_OP_consts +8, DW_OP_stack_value
+; SECTIONS-NEXT: DW_AT_name	("i")
+
+; Source to generate the IR below:
+; void f1();
+; extern bool b;
+; void test() {
+;     // i is not a const throughout the whole scope and should
+;     // not use DW_AT_const_value
+;     int i = 7;
+;     f1();
+;     i = 8;
+;     if (b)
+;       f1();
+; }
+; $ clang++ -S loclist_section.cc -O2 -g  -emit-llvm
+
+@b = external local_unnamed_addr global i8, align 1
+
+; Function Attrs: mustprogress uwtable
+define dso_local void @_Z4testv() local_unnamed_addr #0 !dbg !10 {
+entry:
+    #dbg_value(i32 7, !14, !DIExpression(), !16)
+  tail call void @_Z2f1v(), !dbg !17
+    #dbg_value(i32 8, !14, !DIExpression(), !16)
+  %0 = load i8, ptr @b, align 1, !dbg !18, !tbaa !20, !range !24, !noundef !25
+  %loadedv = trunc nuw i8 %0 to i1, !dbg !18
+  br i1 %loadedv, label %if.then, label %if.end, !dbg !26
+
+if.then:                                          ; preds = %entry
+  tail call void @_Z2f1v(), !dbg !27
+  br label %if.end, !dbg !27
+
+if.end:                                           ; preds = %if.then, %entry
+  ret void, !dbg !28
+}
+
+declare !dbg !29 void @_Z2f1v() local_unnamed_addr #1
+
+attributes #0 = { mustprogress uwtable "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5, !6, !7, !8}
+!llvm.ident = !{!9}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, producer: "clang version 20.0.0git ([email protected]:)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "loclist_section.cc", directory: "Examples/debug_loc", checksumkind: CSK_MD5, checksum: "67769a94389681c8a6da481e2f358abb")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{i32 8, !"PIC Level", i32 2}
+!6 = !{i32 7, !"PIE Level", i32 2}
+!7 = !{i32 7, !"uwtable", i32 2}
+!8 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
+!9 = !{!"clang version 20.0.0git ([email protected]:.../llvm-project.git 7c3256280a78b0505ae4d43985c4d3239451a151)"}
+!10 = distinct !DISubprogram(name: "test", linkageName: "_Z4testv", scope: !1, file: !1, line: 3, type: !11, scopeLine: 3, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !13)
+!11 = !DISubroutineType(types: !12)
+!12 = !{null}
+!13 = !{!14}
+!14 = !DILocalVariable(name: "i", scope: !10, file: !1, line: 6, type: !15)
+!15 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!16 = !DILocation(line: 0, scope: !10)
+!17 = !DILocation(line: 7, column: 5, scope: !10)
+!18 = !DILocation(line: 9, column: 9, scope: !19)
+!19 = distinct !DILexicalBlock(scope: !10, file: !1, line: 9, column: 9)
+!20 = !{!21, !21, i64 0}
+!21 = !{!"bool", !22, i64 0}
+!22 = !{!"omnipotent char", !23, i64 0}
+!23 = !{!"Simple C++ TBAA"}
+!24 = !{i8 0, i8 2}
+!25 = !{}
+!26 = !DILocation(line: 9, column: 9, scope: !10)
+!27 = !DILocation(line: 10, column: 7, scope: !19)
+!28 = !DILocation(line: 11, column: 1, scope: !10)
+!29 = !DISubprogram(name: "f1", linkageName: "_Z2f1v", scope: !1, file: !1, line: 1, type: !11, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized)
diff --git a/llvm/test/DebugInfo/X86/basic-block-sections_1.ll b/llvm/test/DebugInfo/X86/basic-block-sections_1.ll
index 12b60c4dc321bd..c90d715142ec8b 100644
--- a/llvm/test/DebugInfo/X86/basic-block-sections_1.ll
+++ b/llvm/test/DebugInfo/X86/basic-block-sections_1.ll
@@ -16,10 +16,10 @@
 ; NO-SECTIONS: DW_AT_high_pc [DW_FORM_data4] ({{.*}})
 ; BB-SECTIONS: DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
 ; BB-SECTIONS-NEXT: DW_AT_ranges [DW_FORM_sec_offset]
+; BB-SECTIONS-NEXT: [{{.*}}) ".text.hot._Z3fooi"
 ; BB-SECTIONS-NEXT: [{{.*}}) ".text.hot._Z3fooi._Z3fooi.__part.1"
 ; BB-SECTIONS-NEXT: [{{.*}}) ".text.hot._Z3fooi._Z3fooi.__part.2"
 ; BB-SECTIONS-NEXT: [{{.*}}) ".text.hot._Z3fooi._Z3fooi.__part.3"
-; BB-SECTIONS-NEXT: [{{.*}}) ".text.hot._Z3fooi"
 ; BB-SECTIONS-ASM: _Z3fooi:
 ; BB-SECTIONS-ASM: .Ltmp{{[0-9]+}}:
 ; BB-SECTIONS-ASM-NEXT: .loc 1 2 9 prologue_end
@@ -36,14 +36,14 @@
 ; BB-SECTIONS-ASM: .size	_Z3fooi.__part.3, .LBB_END0_{{[0-9]+}}-_Z3fooi.__part.3
 ; BB-SECTIONS-ASM: .Lfunc_end0:
 ; BB-SECTIONS-ASM: .Ldebug_ranges0:
+; BB-SECTIONS-ASM-NEXT:	.quad	.Lfunc_begin0
+; BB-SECTIONS-ASM-NEXT:	.quad	.Lfunc_end0
 ; BB-SECTIONS-ASM-NEXT:	.quad	_Z3fooi.__part.1
 ; BB-SECTIONS-ASM-NEXT:	.quad	.LBB_END0_{{[0-9]+}}
 ; BB-SECTIONS-ASM-NEXT:	.quad	_Z3fooi.__part.2
 ; BB-SECTIONS-ASM-NEXT:	.quad	.LBB_END0_{{[0-9]+}}
 ; BB-SECTIONS-ASM-NEXT:	.quad	_Z3fooi.__part.3
 ; BB-SECTIONS-ASM-NEXT:	.quad	.LBB_END0_{{[0-9]+}}
-; BB-SECTIONS-ASM-NEXT:	.quad	.Lfunc_begin0
-; BB-SECTIONS-ASM-NEXT:	.quad	.Lfunc_end0
 ; BB-SECTIONS-ASM-NEXT:	.quad	0
 ; BB-SECTIONS-ASM-NEXT:	.quad	0
 ; BB-SECTIONS-LINE-TABLE:      0x0000000000000000 1 0 1 0 0 0 is_stmt

@tmsri
Copy link
Member Author

tmsri commented Sep 19, 2024

Continuation from PR 108886 I added a test case to address jmorse's recent comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants