Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clangd] Harden incomingCalls() against possible misinterpretation of a range as pertaining to the wrong file #111616

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

HighCommander4
Copy link
Collaborator

I don't have a concrete motivating scenario here, just something I noticed during code reading:

CallHierarchyIncomingCall::fromRanges are interpreted as ranges in the same file as the CallHierarchyItem representing the caller (CallHierarchyIncomingCall::from).

In the server implementation, we populate them based on Ref::Location, taking only the range and discarding the file, and associate them with a CallHierarchyItem representing Ref::Container.

Now, logically it should be the case that the definition location of the symbol referred to by Ref::Container is in the same file as the Ref itself... but I don't think anything guarantees this, and if for some reason this doesn't hold, we are effectively taking ranges from one file and interpreting them as being in another file.

The patch adds a check for this and discards ranges which are not in fact in the same file.

@llvmbot
Copy link
Collaborator

llvmbot commented Oct 9, 2024

@llvm/pr-subscribers-clang-tools-extra

Author: Nathan Ridge (HighCommander4)

Changes

I don't have a concrete motivating scenario here, just something I noticed during code reading:

CallHierarchyIncomingCall::fromRanges are interpreted as ranges in the same file as the CallHierarchyItem representing the caller (CallHierarchyIncomingCall::from).

In the server implementation, we populate them based on Ref::Location, taking only the range and discarding the file, and associate them with a CallHierarchyItem representing Ref::Container.

Now, logically it should be the case that the definition location of the symbol referred to by Ref::Container is in the same file as the Ref itself... but I don't think anything guarantees this, and if for some reason this doesn't hold, we are effectively taking ranges from one file and interpreting them as being in another file.

The patch adds a check for this and discards ranges which are not in fact in the same file.


Full diff: https://github.com/llvm/llvm-project/pull/111616.diff

1 Files Affected:

  • (modified) clang-tools-extra/clangd/XRefs.cpp (+24-10)
diff --git a/clang-tools-extra/clangd/XRefs.cpp b/clang-tools-extra/clangd/XRefs.cpp
index f94cadeffaa298..cca5035fb74bd4 100644
--- a/clang-tools-extra/clangd/XRefs.cpp
+++ b/clang-tools-extra/clangd/XRefs.cpp
@@ -63,6 +63,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Error.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/raw_ostream.h"
 #include <optional>
@@ -2272,18 +2273,14 @@ incomingCalls(const CallHierarchyItem &Item, const SymbolIndex *Index) {
   // Initially store the ranges in a map keyed by SymbolID of the caller.
   // This allows us to group different calls with the same caller
   // into the same CallHierarchyIncomingCall.
-  llvm::DenseMap<SymbolID, std::vector<Range>> CallsIn;
+  llvm::DenseMap<SymbolID, std::vector<SymbolLocation>> CallsIn;
   // We can populate the ranges based on a refs request only. As we do so, we
   // also accumulate the container IDs into a lookup request.
   LookupRequest ContainerLookup;
   Index->refs(Request, [&](const Ref &R) {
-    auto Loc = indexToLSPLocation(R.Location, Item.uri.file());
-    if (!Loc) {
-      elog("incomingCalls failed to convert location: {0}", Loc.takeError());
-      return;
-    }
-    auto It = CallsIn.try_emplace(R.Container, std::vector<Range>{}).first;
-    It->second.push_back(Loc->range);
+    auto It =
+        CallsIn.try_emplace(R.Container, std::vector<SymbolLocation>{}).first;
+    It->second.push_back(R.Location);
 
     ContainerLookup.IDs.insert(R.Container);
   });
@@ -2292,9 +2289,26 @@ incomingCalls(const CallHierarchyItem &Item, const SymbolIndex *Index) {
   Index->lookup(ContainerLookup, [&](const Symbol &Caller) {
     auto It = CallsIn.find(Caller.ID);
     assert(It != CallsIn.end());
-    if (auto CHI = symbolToCallHierarchyItem(Caller, Item.uri.file()))
+    if (auto CHI = symbolToCallHierarchyItem(Caller, Item.uri.file())) {
+      SymbolLocation CallerLoc =
+          Caller.Definition ? Caller.Definition : Caller.CanonicalDeclaration;
+      std::vector<Range> FromRanges;
+      for (const SymbolLocation &L : It->second) {
+        if (StringRef{L.FileURI} != StringRef{CallerLoc.FileURI}) {
+          elog("incomingCalls: call location not in same file as caller, this "
+               "is unexpected");
+          continue;
+        }
+        Range R;
+        R.start.line = L.Start.line();
+        R.start.character = L.Start.column();
+        R.end.line = L.End.line();
+        R.end.character = L.End.column();
+        FromRanges.push_back(R);
+      }
       Results.push_back(
-          CallHierarchyIncomingCall{std::move(*CHI), std::move(It->second)});
+          CallHierarchyIncomingCall{std::move(*CHI), std::move(FromRanges)});
+    }
   });
   // Sort results by name of container.
   llvm::sort(Results, [](const CallHierarchyIncomingCall &A,

@llvmbot
Copy link
Collaborator

llvmbot commented Oct 9, 2024

@llvm/pr-subscribers-clangd

Author: Nathan Ridge (HighCommander4)

Changes

I don't have a concrete motivating scenario here, just something I noticed during code reading:

CallHierarchyIncomingCall::fromRanges are interpreted as ranges in the same file as the CallHierarchyItem representing the caller (CallHierarchyIncomingCall::from).

In the server implementation, we populate them based on Ref::Location, taking only the range and discarding the file, and associate them with a CallHierarchyItem representing Ref::Container.

Now, logically it should be the case that the definition location of the symbol referred to by Ref::Container is in the same file as the Ref itself... but I don't think anything guarantees this, and if for some reason this doesn't hold, we are effectively taking ranges from one file and interpreting them as being in another file.

The patch adds a check for this and discards ranges which are not in fact in the same file.


Full diff: https://github.com/llvm/llvm-project/pull/111616.diff

1 Files Affected:

  • (modified) clang-tools-extra/clangd/XRefs.cpp (+24-10)
diff --git a/clang-tools-extra/clangd/XRefs.cpp b/clang-tools-extra/clangd/XRefs.cpp
index f94cadeffaa298..cca5035fb74bd4 100644
--- a/clang-tools-extra/clangd/XRefs.cpp
+++ b/clang-tools-extra/clangd/XRefs.cpp
@@ -63,6 +63,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Error.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/raw_ostream.h"
 #include <optional>
@@ -2272,18 +2273,14 @@ incomingCalls(const CallHierarchyItem &Item, const SymbolIndex *Index) {
   // Initially store the ranges in a map keyed by SymbolID of the caller.
   // This allows us to group different calls with the same caller
   // into the same CallHierarchyIncomingCall.
-  llvm::DenseMap<SymbolID, std::vector<Range>> CallsIn;
+  llvm::DenseMap<SymbolID, std::vector<SymbolLocation>> CallsIn;
   // We can populate the ranges based on a refs request only. As we do so, we
   // also accumulate the container IDs into a lookup request.
   LookupRequest ContainerLookup;
   Index->refs(Request, [&](const Ref &R) {
-    auto Loc = indexToLSPLocation(R.Location, Item.uri.file());
-    if (!Loc) {
-      elog("incomingCalls failed to convert location: {0}", Loc.takeError());
-      return;
-    }
-    auto It = CallsIn.try_emplace(R.Container, std::vector<Range>{}).first;
-    It->second.push_back(Loc->range);
+    auto It =
+        CallsIn.try_emplace(R.Container, std::vector<SymbolLocation>{}).first;
+    It->second.push_back(R.Location);
 
     ContainerLookup.IDs.insert(R.Container);
   });
@@ -2292,9 +2289,26 @@ incomingCalls(const CallHierarchyItem &Item, const SymbolIndex *Index) {
   Index->lookup(ContainerLookup, [&](const Symbol &Caller) {
     auto It = CallsIn.find(Caller.ID);
     assert(It != CallsIn.end());
-    if (auto CHI = symbolToCallHierarchyItem(Caller, Item.uri.file()))
+    if (auto CHI = symbolToCallHierarchyItem(Caller, Item.uri.file())) {
+      SymbolLocation CallerLoc =
+          Caller.Definition ? Caller.Definition : Caller.CanonicalDeclaration;
+      std::vector<Range> FromRanges;
+      for (const SymbolLocation &L : It->second) {
+        if (StringRef{L.FileURI} != StringRef{CallerLoc.FileURI}) {
+          elog("incomingCalls: call location not in same file as caller, this "
+               "is unexpected");
+          continue;
+        }
+        Range R;
+        R.start.line = L.Start.line();
+        R.start.character = L.Start.column();
+        R.end.line = L.End.line();
+        R.end.character = L.End.column();
+        FromRanges.push_back(R);
+      }
       Results.push_back(
-          CallHierarchyIncomingCall{std::move(*CHI), std::move(It->second)});
+          CallHierarchyIncomingCall{std::move(*CHI), std::move(FromRanges)});
+    }
   });
   // Sort results by name of container.
   llvm::sort(Results, [](const CallHierarchyIncomingCall &A,

std::vector<Range> FromRanges;
for (const SymbolLocation &L : It->second) {
if (StringRef{L.FileURI} != StringRef{CallerLoc.FileURI}) {
elog("incomingCalls: call location not in same file as caller, this "
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we assert as well?

"is unexpected");
continue;
}
Range R;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I didn't use indexToLSPLocation() here because the URI::resolve operation that it does seems like unnecessary work here. I could factor out the part that converts the range into a function to avoid duplicating it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants