[DebugInfo] Correct the line attribution for IF branches #108300

Merged
merged 1 commit into from
Sep 23, 2024

@pogo59 pogo59 commented Sep 11, 2024

An 'if' statement introduces a scope, but in some cases the conditional branch to the then/else blocks had a debug-info attribution that did not include the scope. This led to some inefficiency in the DWARF line table.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Sep 11, 2024
llvmbot commented Sep 11, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-clang-codegen

Author: Paul T Robinson (pogo59)

Changes

An 'if' statement introduces a scope, but in some cases the conditional branch to the then/else blocks had a debug-info attribution that did not include the scope. This led to some inefficiency in the DWARF line table.


Full diff: https://github.com/llvm/llvm-project/pull/108300.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGStmt.cpp (+1)
  • (added) clang/test/CodeGenCXX/debug-info-line-if-2.cpp (+45)
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index b138c87a853495..2fae4cf666c6b9 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -815,6 +815,7 @@ void CodeGenFunction::EmitIfStmt(const IfStmt &S) {
   // C99 6.8.4.1: The first substatement is executed if the expression compares
   // unequal to 0.  The condition must be a scalar type.
   LexicalScope ConditionScope(*this, S.getCond()->getSourceRange());
+  ApplyDebugLocation DL(*this, S.getCond());
 
   if (S.getInit())
     EmitStmt(S.getInit());
diff --git a/clang/test/CodeGenCXX/debug-info-line-if-2.cpp b/clang/test/CodeGenCXX/debug-info-line-if-2.cpp
new file mode 100644
index 00000000000000..8ab96a7daf4c47
--- /dev/null
+++ b/clang/test/CodeGenCXX/debug-info-line-if-2.cpp
@@ -0,0 +1,45 @@
+// RUN: %clang_cc1 -debug-info-kind=limited -gno-column-info -triple=x86_64-pc-linux -emit-llvm %s -o - | FileCheck  %s
+
+// The important thing is that the compare and the conditional branch have
+// locs with the same scope (the lexical block for the 'if'). By turning off
+// column info, they end up with the same !dbg record, which halves the number
+// of checks to verify the scope.
+
+int c = 2;
+
+int f() {
+#line 100
+  if (int a = 5; a > c)
+    return 1;
+  return 0;
+}
+// CHECK-LABEL: define {{.*}} @_Z1fv()
+// CHECK:       = icmp {{.*}} !dbg [[F_CMP:![0-9]+]]
+// CHECK-NEXT:  br i1 {{.*}} !dbg [[F_CMP]]
+
+int g() {
+#line 200
+  if (int a = f())
+    return 2;
+  return 3;
+}
+// CHECK-LABEL: define {{.*}} @_Z1gv()
+// CHECK:       = icmp {{.*}} !dbg [[G_CMP:![0-9]+]]
+// CHECK-NEXT:  br i1 {{.*}} !dbg [[G_CMP]]
+
+int h() {
+#line 300
+  if (c > 3)
+    return 4;
+  return 5;
+}
+// CHECK-LABEL: define {{.*}} @_Z1hv()
+// CHECK:       = icmp {{.*}} !dbg [[H_CMP:![0-9]+]]
+// CHECK-NEXT:  br i1 {{.*}} !dbg [[H_CMP]]
+
+// CHECK-DAG: [[F_CMP]] = !DILocation(line: 100, scope: [[F_SCOPE:![0-9]+]]
+// CHECK-DAG: [[F_SCOPE]] = distinct !DILexicalBlock({{.*}} line: 100)
+// CHECK-DAG: [[G_CMP]] = !DILocation(line: 200, scope: [[G_SCOPE:![0-9]+]]
+// CHECK-DAG: [[G_SCOPE]] = distinct !DILexicalBlock({{.*}} line: 200)
+// CHECK-DAG: [[H_CMP]] = !DILocation(line: 300, scope: [[H_SCOPE:![0-9]+]]
+// CHECK-DAG: [[H_SCOPE]] = distinct !DILexicalBlock({{.*}} line: 300)


pogo59 commented Sep 11, 2024

}
// CHECK-LABEL: define {{.*}} @_Z1fv()
// CHECK: = icmp {{.*}} !dbg [[F_CMP:![0-9]+]]
// CHECK-NEXT: br i1 {{.*}} !dbg [[F_CMP]]
Collaborator

How come this case ^ already works (so far as I can tell) without this patch, but the others don't? Because we introduce a scope for the variable declaration in a way that we don't in the (cond) case?

Collaborator Author

Ah, yes. That's because EmitIfStmt calls EmitStmt on the initialization statement, which calls EmitStopPoint, which creates a DILocation with the correct scope; this implicitly applies to the rest of the 'if' statement. For the (int a = foo()) case, it just emits a declaration for the variable and then proceeds with a normal expression evaluation; the declaration doesn't call EmitStopPoint.

Collaborator

Hmm, yeah, that EmitStopPoint seems a bit unstable/unreliable - the scoped location handling is designed to be more robust to ensure locations don't "leak out" beyond where they're meant to apply...

I think maybe EmitStopPoint should be removed/reconsidered, but that's perhaps beyond the scope (har har) of this issue; just sharing the thought in case anyone else feels like picking it up and running with it.

How does this location compare to other control structures (loops, etc.)? Do we (& GCC) use the condition as the location for the branch instructions, or would it be more suitable to use the start of the 'if' itself?
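
To illustrate the "leak out" concern in this comment, here is a minimal toy model, not Clang's actual implementation. ToyBuilder, toyEmitStopPoint, and ToyScopedLocation are hypothetical names invented for this sketch; they only mimic the difference between setting a current location that stays in effect and applying one through an RAII guard that restores the previous location when it goes out of scope.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a builder that stamps each emitted instruction
// with whatever its "current location" happens to be.
struct ToyBuilder {
  std::string CurLoc = "none";
};

// Sticky style (assumption: loosely modeled on the stop-point behavior
// described in the thread). The location stays in effect until something
// else overwrites it.
inline void toyEmitStopPoint(ToyBuilder &Builder, std::string Loc) {
  Builder.CurLoc = std::move(Loc);
}

// Scoped style: an RAII guard that applies a location for the lifetime of
// the guard and restores the previous one on destruction.
class ToyScopedLocation {
  ToyBuilder &Builder;
  std::string Saved;

public:
  ToyScopedLocation(ToyBuilder &B, std::string Loc)
      : Builder(B), Saved(B.CurLoc) {
    Builder.CurLoc = std::move(Loc);
  }
  ~ToyScopedLocation() { Builder.CurLoc = Saved; }

  // Copying a guard would restore the saved location twice.
  ToyScopedLocation(const ToyScopedLocation &) = delete;
  ToyScopedLocation &operator=(const ToyScopedLocation &) = delete;
};
```

In this model, a location set through the guard cannot outlive the block that declares the guard, while one set through toyEmitStopPoint applies to everything emitted afterwards until it is overwritten.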

Collaborator

& I guess this all only applies when the control structure doesn't have {}? (that being why the test doesn't have braces?)

Collaborator Author

I followed this where my test case led me. You're right it's worth looking at other control structures that have implied lexical blocks.

> & I guess this all only applies when the control structure doesn't have {}? (that being why the test doesn't have braces?)

No, this applies in all cases.

if (int a = 5; b > c) {
    return a;
} else {
    return a + 1;
}

The implied lexical block started by the if statement spans its then and else parts; whether those parts are simple or compound statements doesn't matter. This is analogous to how a function definition has an implied lexical block starting at the opening paren of the parameter list: the function name is visible in the containing scope, but the parameter names are not.
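
As a small language-level illustration of the scoping point above (a sketch only; the function name pick is hypothetical, and the init-statement form assumes C++17 or later), a variable declared in the 'if' header is visible in both branches even when neither branch is wrapped in braces:

```cpp
int c = 2;

// 'a' is declared in the init-statement of the 'if', so it belongs to the
// block implied by the 'if' statement itself, not to either branch.
int pick(int b) {
  if (int a = 5; b > c)
    return a;      // then-part without braces: 'a' is in scope
  else
    return a + 1;  // else-part without braces: 'a' is still in scope
  // 'a' would not be visible past this point.
}
```

Adding braces around either branch would introduce further nested blocks but would not change where 'a' is visible.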

Collaborator Author

Re. other statements: 'while' does not appear to have a problem; 'for', however, does. Looking at the implementation, I would have expected it to crop up only when the first 'for' clause is empty, but in fact it happens regardless. It conjures up not one but two DILexicalBlocks, both pointing to the 'for' keyword. I'd prefer to look into that more deeply as a separate task.
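
For anyone wanting to compare the loop forms mentioned here, the following is a hypothetical input file (the function names and the global are illustrative and not taken from the PR). It could be compiled with the same flags as the RUN line in the test above to inspect the !dbg attachments on each loop's compare and branch; no particular output is claimed.

```cpp
int limit = 3;

// 'while' loop: reported above as not showing the problem.
int sum_while() {
  int total = 0, i = 0;
  while (i < limit)
    total += i++;
  return total;
}

// 'for' loop whose first clause declares a variable.
int sum_for_init() {
  int total = 0;
  for (int i = 0; i < limit; ++i)
    total += i;
  return total;
}

// 'for' loop whose first clause is empty.
int sum_for_empty() {
  int total = 0, i = 0;
  for (; i < limit; ++i)
    total += i;
  return total;
}
```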

@dwblaikie dwblaikie left a comment

Fair enough

@pogo59 pogo59 merged commit 53abbce into llvm:main Sep 23, 2024
9 of 12 checks passed

pogo59 commented Sep 23, 2024

Wow, better than I expected, considering that .debug_line is not one of the larger sections.

augusto2112 pushed a commit to augusto2112/llvm-project that referenced this pull request Sep 26, 2024

pogo59 commented Sep 27, 2024

Filed #110313 as the followup for looking at the for statement.

xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Oct 4, 2024
Labels
clang:codegen clang Clang issues not falling into any other category debuginfo