Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: [clang-tidy] [analyzer] Move nondeterministic pointer usage check to tidy #110471

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

vabridgers
Copy link
Contributor

@vabridgers vabridgers commented Sep 30, 2024

This change moves the alpha.nondeterministic.PointerSorting and alpha.nondeterministic.PointerIteration static analyzer checkers to a single clang-tidy check. Those checkers were implemented as simple clang-tidy check-like code, wrapped in the static analyzer framework. The documentation was updated to describe what the checks can and cannot do, and testing was completed on a broad set of open-source projects.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang-tools-extra clang-tidy clang:static analyzer labels Sep 30, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: None (vabridgers)

Changes

RFC: This PR is not ready to merge yet, preliminary comments are welcome.
Here are some questions to consider...

Q) Is the name and file location ok in the directory structure?
I initially chose bugprone and NondeterministicPointerUsage as a
starting point for review.
Q) I wrote an explanation for the check based on internal discussion
since I felt like this was missing. Please check and comment.
Q) There are more ways to iterate over an unordered container
than captured here. Do those need to be detected as well?
Q) I squashed PointerIteration and PointerSorting. I think that works
in this case, but interested to hear your comments about that.
Q) I ended up expanding upon the C++ simulation header, but had thoughts
about breaking that up into multiple smaller files. Maybe there's an
opportunity to refactor some C++ simulation files across multiple
checkers as a seperate PR first, or maybe even as part of this one?

This change moves the alpha.nondeterministic.PointerSorting and alpha.nondeterministic.PointerIteration static analyzer checkers to a single clang-tidy check. Those checkers were implemented as clang-tidy checks wrapped in the static analyzer framework. The documentation was updated to describe what the checks can and cannot do, and testing was completed on a broad set of open source projects.


Patch is 75.02 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110471.diff

13 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp (+3)
  • (modified) clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt (+1)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp (+67)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h (+36)
  • (added) clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst (+31)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h (+1450)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/nondeterministic-pointer-usage.cpp (+69)
  • (modified) clang/docs/analyzer/checkers.rst (-31)
  • (modified) clang/include/clang/StaticAnalyzer/Checkers/Checkers.td (-18)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt (-2)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerIterationChecker.cpp (-101)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp (-115)
  • (removed) clang/test/Analysis/ptr-sort.cpp (-36)
diff --git a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
index 689eb92a3d8d17..7c177196d76f58 100644
--- a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
@@ -48,6 +48,7 @@
 #include "MultipleStatementMacroCheck.h"
 #include "NoEscapeCheck.h"
 #include "NonZeroEnumToBoolConversionCheck.h"
+#include "NondeterministicPointerUsageCheck.h"
 #include "NotNullTerminatedResultCheck.h"
 #include "OptionalValueConversionCheck.h"
 #include "ParentVirtualCallCheck.h"
@@ -179,6 +180,8 @@ class BugproneModule : public ClangTidyModule {
     CheckFactories.registerCheck<cppcoreguidelines::NarrowingConversionsCheck>(
         "bugprone-narrowing-conversions");
     CheckFactories.registerCheck<NoEscapeCheck>("bugprone-no-escape");
+    CheckFactories.registerCheck<NondeterministicPointerUsageCheck>(
+        "bugprone-nondeterministic-pointer-usage");
     CheckFactories.registerCheck<NonZeroEnumToBoolConversionCheck>(
         "bugprone-non-zero-enum-to-bool-conversion");
     CheckFactories.registerCheck<NotNullTerminatedResultCheck>(
diff --git a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
index cb0d8ae98bac58..5628572b984226 100644
--- a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
@@ -44,6 +44,7 @@ add_clang_library(clangTidyBugproneModule
   MultipleNewInOneExpressionCheck.cpp
   MultipleStatementMacroCheck.cpp
   NoEscapeCheck.cpp
+  NondeterministicPointerUsageCheck.cpp
   NonZeroEnumToBoolConversionCheck.cpp
   NotNullTerminatedResultCheck.cpp
   OptionalValueConversionCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
new file mode 100644
index 00000000000000..ddc314af739d97
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
@@ -0,0 +1,67 @@
+//===--- NondetermnisticPointerUsageCheck.cpp - clang-tidy ------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "NondeterministicPointerUsageCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/Lex/Lexer.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::bugprone {
+
+void NondeterministicPointerUsageCheck::registerMatchers(MatchFinder *Finder) {
+
+  auto LoopVariable = varDecl(hasType(hasCanonicalType(pointerType())));
+
+  auto RangeInit = declRefExpr(to(varDecl(hasType(recordDecl(
+      anyOf(hasName("std::unordered_set"), hasName("std::unordered_map"),
+            hasName("std::unordered_multiset"),
+            hasName("std::unordered_multimap")))))));
+
+  Finder->addMatcher(
+      stmt(cxxForRangeStmt(hasRangeInit(RangeInit.bind("rangeinit")),
+                           hasLoopVariable(LoopVariable.bind("loopVar"))))
+          .bind("cxxForRangeStmt"),
+      this);
+
+  auto SortFuncM = anyOf(callee(functionDecl(hasName("std::is_sorted"))),
+                         callee(functionDecl(hasName("std::nth_element"))),
+                         callee(functionDecl(hasName("std::sort"))),
+                         callee(functionDecl(hasName("std::partial_sort"))),
+                         callee(functionDecl(hasName("std::partition"))),
+                         callee(functionDecl(hasName("std::stable_partition"))),
+                         callee(functionDecl(hasName("std::stable_sort"))));
+
+  auto IteratesPointerEltsM = hasArgument(
+      0,
+      cxxMemberCallExpr(on(hasType(cxxRecordDecl(has(fieldDecl(hasType(
+          hasCanonicalType(pointsTo(hasCanonicalType(pointerType())))))))))));
+
+  Finder->addMatcher(stmt(callExpr(allOf(SortFuncM, IteratesPointerEltsM)))
+                         .bind("sortsemantic"),
+                     this);
+}
+
+void NondeterministicPointerUsageCheck::check(
+    const MatchFinder::MatchResult &Result) {
+  const auto *ForRangePointers =
+      Result.Nodes.getNodeAs<Stmt>("cxxForRangeStmt");
+  const auto *SortPointers = Result.Nodes.getNodeAs<Stmt>("sortsemantic");
+
+  if ((ForRangePointers) && !(ForRangePointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<CXXForRangeStmt>(ForRangePointers);
+    diag(Node->getRParenLoc(), "Iteration of pointers is nondeterministic");
+  }
+
+  if ((SortPointers) && !(SortPointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<Stmt>(SortPointers);
+    diag(Node->getBeginLoc(), "Sorting pointers is nondeterministic");
+  }
+}
+
+} // namespace clang::tidy::bugprone
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
new file mode 100644
index 00000000000000..556a4fce4dffe2
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
@@ -0,0 +1,36 @@
+//===--- NondeterministicPointerUsageCheck.h - clang-tidy ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+
+#include "../ClangTidyCheck.h"
+
+namespace clang::tidy::bugprone {
+
+/// Finds temporaries that look like RAII objects.
+///
+/// For the user-facing documentation see:
+/// http://clang.llvm.org/extra/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.html
+class NondeterministicPointerUsageCheck : public ClangTidyCheck {
+public:
+  NondeterministicPointerUsageCheck(StringRef Name, ClangTidyContext *Context)
+      : ClangTidyCheck(Name, Context) {}
+  bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
+    return LangOpts.CPlusPlus;
+  }
+  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
+  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
+  std::optional<TraversalKind> getCheckTraversalKind() const override {
+    return TK_IgnoreUnlessSpelledInSource;
+  }
+};
+
+} // namespace clang::tidy::bugprone
+
+#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
diff --git a/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
new file mode 100644
index 00000000000000..8625487c6d694a
--- /dev/null
+++ b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
@@ -0,0 +1,31 @@
+.. title:: clang-tidy - bugprone-nondeterministic-pointer-usage
+
+nondeterministic-pointer-usage
+==============================
+
+Finds nondeterministic usages of pointers in unordered containers.
+
+One canonical example is iteration across a container of pointers.
+
+.. code-block:: c++
+
+  {
+    for (auto i : UnorderedPtrSet)
+      f(i);
+  }
+
+Another such example is sorting a container of pointers.
+
+.. code-block:: c++
+
+  {
+    std::sort(VectorOfPtr.begin(), VectorOfPtr.end());
+  }
+
+Iteration of a containers of pointers may present the order of different
+pointers differently across different runs of a program. In some cases this
+may be acceptable behavior, in others this may be unexpected behavior. This
+check is advisory for this reason.
+
+This check only detects range-based for loops over unordered sets. Other
+similar usages will not be found and are false negatives.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
new file mode 100644
index 00000000000000..b279a8f20d8ddd
--- /dev/null
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
@@ -0,0 +1,1450 @@
+// Like the compiler, the static analyzer treats some functions differently if
+// they come from a system header -- for example, it is assumed that system
+// functions do not arbitrarily free() their parameters, and that some bugs
+// found in system headers cannot be fixed by the user and should be
+// suppressed.
+#pragma clang system_header
+
+typedef unsigned char uint8_t;
+
+typedef __typeof__(sizeof(int)) size_t;
+typedef __typeof__((char*)0-(char*)0) ptrdiff_t;
+void *memmove(void *s1, const void *s2, size_t n);
+
+namespace std {
+  typedef size_t size_type;
+#if __cplusplus >= 201103L
+  using nullptr_t = decltype(nullptr);
+#endif
+}
+
+namespace std {
+  struct input_iterator_tag { };
+  struct output_iterator_tag { };
+  struct forward_iterator_tag : public input_iterator_tag { };
+  struct bidirectional_iterator_tag : public forward_iterator_tag { };
+  struct random_access_iterator_tag : public bidirectional_iterator_tag { };
+
+  template <typename Iterator> struct iterator_traits {
+    typedef typename Iterator::difference_type difference_type;
+    typedef typename Iterator::value_type value_type;
+    typedef typename Iterator::pointer pointer;
+    typedef typename Iterator::reference reference;
+    typedef typename Iterator::iterator_category iterator_category;
+  };
+}
+
+template <typename T, typename Ptr, typename Ref> struct __vector_iterator {
+  typedef __vector_iterator<T, T *, T &> iterator;
+  typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __vector_iterator(const Ptr p = 0) : ptr(p) {}
+  __vector_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __vector_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __vector_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __vector_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  template<typename U, typename Ptr2, typename Ref2>
+  difference_type operator-(const __vector_iterator<U, Ptr2, Ref2> &rhs);
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __deque_iterator {
+  typedef __deque_iterator<T, T *, T &> iterator;
+  typedef __deque_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __deque_iterator(const Ptr p = 0) : ptr(p) {}
+  __deque_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __deque_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __deque_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __deque_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __list_iterator {
+  typedef __list_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __list_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::bidirectional_iterator_tag iterator_category;
+
+  __list_iterator(T* it = 0) : item(it) {}
+  __list_iterator(const iterator &rhs): item(rhs.item) {}
+  __list_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __list_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  __list_iterator<T, Ptr, Ref> operator--() { item = item->prev; return *this; }
+  __list_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this;
+    item = item->prev;
+    return tmp;
+  }
+
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __list_iterator;
+
+private:
+  T* item;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __fwdl_iterator {
+  typedef __fwdl_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __fwdl_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::forward_iterator_tag iterator_category;
+
+  __fwdl_iterator(T* it = 0) : item(it) {}
+  __fwdl_iterator(const iterator &rhs): item(rhs.item) {}
+  __fwdl_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __fwdl_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __fwdl_iterator;
+
+private:
+  T* item;
+};
+
+namespace std {
+  template <class T1, class T2>
+  struct pair {
+    T1 first;
+    T2 second;
+
+    pair() : first(), second() {}
+    pair(const T1 &a, const T2 &b) : first(a), second(b) {}
+
+    template<class U1, class U2>
+    pair(const pair<U1, U2> &other) : first(other.first),
+                                      second(other.second) {}
+  };
+
+  template<class T2, class T1>
+  T2& get(pair<T1, T2>& p) ;
+  template<class T1, class T2>
+  T1& get(const pair<T1, T2>& p) ;
+
+  typedef __typeof__(sizeof(int)) size_t;
+
+  template <class T> class initializer_list;
+
+  template< class T > struct remove_reference      {typedef T type;};
+  template< class T > struct remove_reference<T&>  {typedef T type;};
+  template< class T > struct remove_reference<T&&> {typedef T type;};
+
+  template<typename T> typename remove_reference<T>::type&& move(T&& a);
+  template <typename T> T *__addressof(T &x);
+  template <typename T> T *addressof(T &x);
+  template <typename T> const T& as_const(T& x);
+  template <typename T> T&& forward(T&& x);
+  // FIXME: Declare forward_like
+  // FIXME: Declare move_if_noexcept
+
+  template< class T >
+  using remove_reference_t = typename remove_reference<T>::type;
+
+  template <class T>
+  void swap(T &a, T &b) {
+    T c(std::move(a));
+    a = std::move(b);
+    b = std::move(c);
+  }
+
+  template<typename T>
+  class vector {
+    T *_start;
+    T *_finish;
+    T *_end_of_storage;
+
+  public:
+    typedef T value_type;
+    typedef size_t size_type;
+    typedef __vector_iterator<T, T *, T &> iterator;
+    typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+    vector() : _start(0), _finish(0), _end_of_storage(0) {}
+    template <typename InputIterator>
+    vector(InputIterator first, InputIterator last);
+    vector(const vector &other);
+    vector(vector &&other);
+    ~vector();
+
+    size_t size() const {
+      return size_t(_finish - _start);
+    }
+
+    vector& operator=(const vector &other);
+    vector& operator=(vector &&other);
+    vector& operator=(std::initializer_list<T> ilist);
+
+    void assign(size_type count, const T &value);
+    template <typename InputIterator >
+    void assign(InputIterator first, InputIterator last);
+    void assign(std::initializer_list<T> ilist);
+
+    void clear();
+
+    void push_back(const T &value);
+    void push_back(T &&value);
+    template<class... Args>
+    void emplace_back(Args&&... args);
+    void pop_back();
+
+    iterator insert(const_iterator position, const value_type &val);
+    iterator insert(const_iterator position, size_type n,
+                    const value_type &val);
+    template <typename InputIterator>
+    iterator insert(const_iterator position, InputIterator first,
+                    InputIterator last);
+    iterator insert(const_iterator position, value_type &&val);
+    iterator insert(const_iterator position, initializer_list<value_type> il);
+
+    template <class... Args>
+    iterator emplace(const_iterator position, Args&&... args);
+
+    iterator erase(const_iterator position);
+    iterator erase(const_iterator first, const_iterator last);
+
+    T &operator[](size_t n) {
+      return _start[n];
+    }
+
+    const T &operator[](size_t n) const {
+      return _start[n];
+    }
+
+    iterator begin() { return iterator(_start); }
+    const_iterator begin() const { return const_iterator(_start); }
+    const_iterator cbegin() const { return const_iterator(_start); }
+    iterator end() { return iterator(_finish); }
+    const_iterator end() const { return const_iterator(_finish); }
+    const_iterator cend() const { return const_iterator(_finish); }
+    T& front() { return *begin(); }
+    const T& front() const { return *begin(); }
+    T& back() { return *(end() - 1); }
+    const T& back() const { re...
[truncated]

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-clang-tidy

Author: None (vabridgers)

Changes

RFC: This PR is not ready to merge yet, preliminary comments are welcome.
Here are some questions to consider...

Q) Is the name and file location ok in the directory structure?
I initially chose bugprone and NondeterministicPointerUsage as a
starting point for review.
Q) I wrote an explanation for the check based on internal discussion
since I felt like this was missing. Please check and comment.
Q) There are more ways to iterate over an unordered container
than captured here. Do those need to be detected as well?
Q) I squashed PointerIteration and PointerSorting. I think that works
in this case, but interested to hear your comments about that.
Q) I ended up expanding upon the C++ simulation header, but had thoughts
about breaking that up into multiple smaller files. Maybe there's an
opportunity to refactor some C++ simulation files across multiple
checkers as a seperate PR first, or maybe even as part of this one?

This change moves the alpha.nondeterministic.PointerSorting and alpha.nondeterministic.PointerIteration static analyzer checkers to a single clang-tidy check. Those checkers were implemented as clang-tidy checks wrapped in the static analyzer framework. The documentation was updated to describe what the checks can and cannot do, and testing was completed on a broad set of open source projects.


Patch is 75.02 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110471.diff

13 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp (+3)
  • (modified) clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt (+1)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp (+67)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h (+36)
  • (added) clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst (+31)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h (+1450)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/nondeterministic-pointer-usage.cpp (+69)
  • (modified) clang/docs/analyzer/checkers.rst (-31)
  • (modified) clang/include/clang/StaticAnalyzer/Checkers/Checkers.td (-18)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt (-2)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerIterationChecker.cpp (-101)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp (-115)
  • (removed) clang/test/Analysis/ptr-sort.cpp (-36)
diff --git a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
index 689eb92a3d8d17..7c177196d76f58 100644
--- a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
@@ -48,6 +48,7 @@
 #include "MultipleStatementMacroCheck.h"
 #include "NoEscapeCheck.h"
 #include "NonZeroEnumToBoolConversionCheck.h"
+#include "NondeterministicPointerUsageCheck.h"
 #include "NotNullTerminatedResultCheck.h"
 #include "OptionalValueConversionCheck.h"
 #include "ParentVirtualCallCheck.h"
@@ -179,6 +180,8 @@ class BugproneModule : public ClangTidyModule {
     CheckFactories.registerCheck<cppcoreguidelines::NarrowingConversionsCheck>(
         "bugprone-narrowing-conversions");
     CheckFactories.registerCheck<NoEscapeCheck>("bugprone-no-escape");
+    CheckFactories.registerCheck<NondeterministicPointerUsageCheck>(
+        "bugprone-nondeterministic-pointer-usage");
     CheckFactories.registerCheck<NonZeroEnumToBoolConversionCheck>(
         "bugprone-non-zero-enum-to-bool-conversion");
     CheckFactories.registerCheck<NotNullTerminatedResultCheck>(
diff --git a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
index cb0d8ae98bac58..5628572b984226 100644
--- a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
@@ -44,6 +44,7 @@ add_clang_library(clangTidyBugproneModule
   MultipleNewInOneExpressionCheck.cpp
   MultipleStatementMacroCheck.cpp
   NoEscapeCheck.cpp
+  NondeterministicPointerUsageCheck.cpp
   NonZeroEnumToBoolConversionCheck.cpp
   NotNullTerminatedResultCheck.cpp
   OptionalValueConversionCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
new file mode 100644
index 00000000000000..ddc314af739d97
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
@@ -0,0 +1,67 @@
+//===--- NondetermnisticPointerUsageCheck.cpp - clang-tidy ------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "NondeterministicPointerUsageCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/Lex/Lexer.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::bugprone {
+
+void NondeterministicPointerUsageCheck::registerMatchers(MatchFinder *Finder) {
+
+  auto LoopVariable = varDecl(hasType(hasCanonicalType(pointerType())));
+
+  auto RangeInit = declRefExpr(to(varDecl(hasType(recordDecl(
+      anyOf(hasName("std::unordered_set"), hasName("std::unordered_map"),
+            hasName("std::unordered_multiset"),
+            hasName("std::unordered_multimap")))))));
+
+  Finder->addMatcher(
+      stmt(cxxForRangeStmt(hasRangeInit(RangeInit.bind("rangeinit")),
+                           hasLoopVariable(LoopVariable.bind("loopVar"))))
+          .bind("cxxForRangeStmt"),
+      this);
+
+  auto SortFuncM = anyOf(callee(functionDecl(hasName("std::is_sorted"))),
+                         callee(functionDecl(hasName("std::nth_element"))),
+                         callee(functionDecl(hasName("std::sort"))),
+                         callee(functionDecl(hasName("std::partial_sort"))),
+                         callee(functionDecl(hasName("std::partition"))),
+                         callee(functionDecl(hasName("std::stable_partition"))),
+                         callee(functionDecl(hasName("std::stable_sort"))));
+
+  auto IteratesPointerEltsM = hasArgument(
+      0,
+      cxxMemberCallExpr(on(hasType(cxxRecordDecl(has(fieldDecl(hasType(
+          hasCanonicalType(pointsTo(hasCanonicalType(pointerType())))))))))));
+
+  Finder->addMatcher(stmt(callExpr(allOf(SortFuncM, IteratesPointerEltsM)))
+                         .bind("sortsemantic"),
+                     this);
+}
+
+void NondeterministicPointerUsageCheck::check(
+    const MatchFinder::MatchResult &Result) {
+  const auto *ForRangePointers =
+      Result.Nodes.getNodeAs<Stmt>("cxxForRangeStmt");
+  const auto *SortPointers = Result.Nodes.getNodeAs<Stmt>("sortsemantic");
+
+  if ((ForRangePointers) && !(ForRangePointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<CXXForRangeStmt>(ForRangePointers);
+    diag(Node->getRParenLoc(), "Iteration of pointers is nondeterministic");
+  }
+
+  if ((SortPointers) && !(SortPointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<Stmt>(SortPointers);
+    diag(Node->getBeginLoc(), "Sorting pointers is nondeterministic");
+  }
+}
+
+} // namespace clang::tidy::bugprone
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
new file mode 100644
index 00000000000000..556a4fce4dffe2
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
@@ -0,0 +1,36 @@
+//===--- NondeterministicPointerUsageCheck.h - clang-tidy ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+
+#include "../ClangTidyCheck.h"
+
+namespace clang::tidy::bugprone {
+
+/// Finds temporaries that look like RAII objects.
+///
+/// For the user-facing documentation see:
+/// http://clang.llvm.org/extra/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.html
+class NondeterministicPointerUsageCheck : public ClangTidyCheck {
+public:
+  NondeterministicPointerUsageCheck(StringRef Name, ClangTidyContext *Context)
+      : ClangTidyCheck(Name, Context) {}
+  bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
+    return LangOpts.CPlusPlus;
+  }
+  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
+  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
+  std::optional<TraversalKind> getCheckTraversalKind() const override {
+    return TK_IgnoreUnlessSpelledInSource;
+  }
+};
+
+} // namespace clang::tidy::bugprone
+
+#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
diff --git a/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
new file mode 100644
index 00000000000000..8625487c6d694a
--- /dev/null
+++ b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
@@ -0,0 +1,31 @@
+.. title:: clang-tidy - bugprone-nondeterministic-pointer-usage
+
+nondeterministic-pointer-usage
+==============================
+
+Finds nondeterministic usages of pointers in unordered containers.
+
+One canonical example is iteration across a container of pointers.
+
+.. code-block:: c++
+
+  {
+    for (auto i : UnorderedPtrSet)
+      f(i);
+  }
+
+Another such example is sorting a container of pointers.
+
+.. code-block:: c++
+
+  {
+    std::sort(VectorOfPtr.begin(), VectorOfPtr.end());
+  }
+
+Iteration of a containers of pointers may present the order of different
+pointers differently across different runs of a program. In some cases this
+may be acceptable behavior, in others this may be unexpected behavior. This
+check is advisory for this reason.
+
+This check only detects range-based for loops over unordered sets. Other
+similar usages will not be found and are false negatives.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
new file mode 100644
index 00000000000000..b279a8f20d8ddd
--- /dev/null
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
@@ -0,0 +1,1450 @@
+// Like the compiler, the static analyzer treats some functions differently if
+// they come from a system header -- for example, it is assumed that system
+// functions do not arbitrarily free() their parameters, and that some bugs
+// found in system headers cannot be fixed by the user and should be
+// suppressed.
+#pragma clang system_header
+
+typedef unsigned char uint8_t;
+
+typedef __typeof__(sizeof(int)) size_t;
+typedef __typeof__((char*)0-(char*)0) ptrdiff_t;
+void *memmove(void *s1, const void *s2, size_t n);
+
+namespace std {
+  typedef size_t size_type;
+#if __cplusplus >= 201103L
+  using nullptr_t = decltype(nullptr);
+#endif
+}
+
+namespace std {
+  struct input_iterator_tag { };
+  struct output_iterator_tag { };
+  struct forward_iterator_tag : public input_iterator_tag { };
+  struct bidirectional_iterator_tag : public forward_iterator_tag { };
+  struct random_access_iterator_tag : public bidirectional_iterator_tag { };
+
+  template <typename Iterator> struct iterator_traits {
+    typedef typename Iterator::difference_type difference_type;
+    typedef typename Iterator::value_type value_type;
+    typedef typename Iterator::pointer pointer;
+    typedef typename Iterator::reference reference;
+    typedef typename Iterator::iterator_category iterator_category;
+  };
+}
+
+template <typename T, typename Ptr, typename Ref> struct __vector_iterator {
+  typedef __vector_iterator<T, T *, T &> iterator;
+  typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __vector_iterator(const Ptr p = 0) : ptr(p) {}
+  __vector_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __vector_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __vector_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __vector_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  template<typename U, typename Ptr2, typename Ref2>
+  difference_type operator-(const __vector_iterator<U, Ptr2, Ref2> &rhs);
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __deque_iterator {
+  typedef __deque_iterator<T, T *, T &> iterator;
+  typedef __deque_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __deque_iterator(const Ptr p = 0) : ptr(p) {}
+  __deque_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __deque_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __deque_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __deque_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __list_iterator {
+  typedef __list_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __list_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::bidirectional_iterator_tag iterator_category;
+
+  __list_iterator(T* it = 0) : item(it) {}
+  __list_iterator(const iterator &rhs): item(rhs.item) {}
+  __list_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __list_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  __list_iterator<T, Ptr, Ref> operator--() { item = item->prev; return *this; }
+  __list_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this;
+    item = item->prev;
+    return tmp;
+  }
+
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __list_iterator;
+
+private:
+  T* item;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __fwdl_iterator {
+  typedef __fwdl_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __fwdl_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::forward_iterator_tag iterator_category;
+
+  __fwdl_iterator(T* it = 0) : item(it) {}
+  __fwdl_iterator(const iterator &rhs): item(rhs.item) {}
+  __fwdl_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __fwdl_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __fwdl_iterator;
+
+private:
+  T* item;
+};
+
+namespace std {
+  template <class T1, class T2>
+  struct pair {
+    T1 first;
+    T2 second;
+
+    pair() : first(), second() {}
+    pair(const T1 &a, const T2 &b) : first(a), second(b) {}
+
+    template<class U1, class U2>
+    pair(const pair<U1, U2> &other) : first(other.first),
+                                      second(other.second) {}
+  };
+
+  template<class T2, class T1>
+  T2& get(pair<T1, T2>& p) ;
+  template<class T1, class T2>
+  T1& get(const pair<T1, T2>& p) ;
+
+  typedef __typeof__(sizeof(int)) size_t;
+
+  template <class T> class initializer_list;
+
+  template< class T > struct remove_reference      {typedef T type;};
+  template< class T > struct remove_reference<T&>  {typedef T type;};
+  template< class T > struct remove_reference<T&&> {typedef T type;};
+
+  template<typename T> typename remove_reference<T>::type&& move(T&& a);
+  template <typename T> T *__addressof(T &x);
+  template <typename T> T *addressof(T &x);
+  template <typename T> const T& as_const(T& x);
+  template <typename T> T&& forward(T&& x);
+  // FIXME: Declare forward_like
+  // FIXME: Declare move_if_noexcept
+
+  template< class T >
+  using remove_reference_t = typename remove_reference<T>::type;
+
+  template <class T>
+  void swap(T &a, T &b) {
+    T c(std::move(a));
+    a = std::move(b);
+    b = std::move(c);
+  }
+
+  template<typename T>
+  class vector {
+    T *_start;
+    T *_finish;
+    T *_end_of_storage;
+
+  public:
+    typedef T value_type;
+    typedef size_t size_type;
+    typedef __vector_iterator<T, T *, T &> iterator;
+    typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+    vector() : _start(0), _finish(0), _end_of_storage(0) {}
+    template <typename InputIterator>
+    vector(InputIterator first, InputIterator last);
+    vector(const vector &other);
+    vector(vector &&other);
+    ~vector();
+
+    size_t size() const {
+      return size_t(_finish - _start);
+    }
+
+    vector& operator=(const vector &other);
+    vector& operator=(vector &&other);
+    vector& operator=(std::initializer_list<T> ilist);
+
+    void assign(size_type count, const T &value);
+    template <typename InputIterator >
+    void assign(InputIterator first, InputIterator last);
+    void assign(std::initializer_list<T> ilist);
+
+    void clear();
+
+    void push_back(const T &value);
+    void push_back(T &&value);
+    template<class... Args>
+    void emplace_back(Args&&... args);
+    void pop_back();
+
+    iterator insert(const_iterator position, const value_type &val);
+    iterator insert(const_iterator position, size_type n,
+                    const value_type &val);
+    template <typename InputIterator>
+    iterator insert(const_iterator position, InputIterator first,
+                    InputIterator last);
+    iterator insert(const_iterator position, value_type &&val);
+    iterator insert(const_iterator position, initializer_list<value_type> il);
+
+    template <class... Args>
+    iterator emplace(const_iterator position, Args&&... args);
+
+    iterator erase(const_iterator position);
+    iterator erase(const_iterator first, const_iterator last);
+
+    T &operator[](size_t n) {
+      return _start[n];
+    }
+
+    const T &operator[](size_t n) const {
+      return _start[n];
+    }
+
+    iterator begin() { return iterator(_start); }
+    const_iterator begin() const { return const_iterator(_start); }
+    const_iterator cbegin() const { return const_iterator(_start); }
+    iterator end() { return iterator(_finish); }
+    const_iterator end() const { return const_iterator(_finish); }
+    const_iterator cend() const { return const_iterator(_finish); }
+    T& front() { return *begin(); }
+    const T& front() const { return *begin(); }
+    T& back() { return *(end() - 1); }
+    const T& back() const { re...
[truncated]

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-clang-tools-extra

Author: None (vabridgers)

Changes

RFC: This PR is not ready to merge yet, preliminary comments are welcome.
Here are some questions to consider...

Q) Is the name and file location ok in the directory structure?
I initially chose bugprone and NondeterministicPointerUsage as a
starting point for review.
Q) I wrote an explanation for the check based on internal discussion
since I felt like this was missing. Please check and comment.
Q) There are more ways to iterate over an unordered container
than captured here. Do those need to be detected as well?
Q) I squashed PointerIteration and PointerSorting. I think that works
in this case, but interested to hear your comments about that.
Q) I ended up expanding upon the C++ simulation header, but had thoughts
about breaking that up into multiple smaller files. Maybe there's an
opportunity to refactor some C++ simulation files across multiple
checkers as a seperate PR first, or maybe even as part of this one?

This change moves the alpha.nondeterministic.PointerSorting and alpha.nondeterministic.PointerIteration static analyzer checkers to a single clang-tidy check. Those checkers were implemented as clang-tidy checks wrapped in the static analyzer framework. The documentation was updated to describe what the checks can and cannot do, and testing was completed on a broad set of open source projects.


Patch is 75.02 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110471.diff

13 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp (+3)
  • (modified) clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt (+1)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp (+67)
  • (added) clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h (+36)
  • (added) clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst (+31)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h (+1450)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/nondeterministic-pointer-usage.cpp (+69)
  • (modified) clang/docs/analyzer/checkers.rst (-31)
  • (modified) clang/include/clang/StaticAnalyzer/Checkers/Checkers.td (-18)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt (-2)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerIterationChecker.cpp (-101)
  • (removed) clang/lib/StaticAnalyzer/Checkers/PointerSortingChecker.cpp (-115)
  • (removed) clang/test/Analysis/ptr-sort.cpp (-36)
diff --git a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
index 689eb92a3d8d17..7c177196d76f58 100644
--- a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
@@ -48,6 +48,7 @@
 #include "MultipleStatementMacroCheck.h"
 #include "NoEscapeCheck.h"
 #include "NonZeroEnumToBoolConversionCheck.h"
+#include "NondeterministicPointerUsageCheck.h"
 #include "NotNullTerminatedResultCheck.h"
 #include "OptionalValueConversionCheck.h"
 #include "ParentVirtualCallCheck.h"
@@ -179,6 +180,8 @@ class BugproneModule : public ClangTidyModule {
     CheckFactories.registerCheck<cppcoreguidelines::NarrowingConversionsCheck>(
         "bugprone-narrowing-conversions");
     CheckFactories.registerCheck<NoEscapeCheck>("bugprone-no-escape");
+    CheckFactories.registerCheck<NondeterministicPointerUsageCheck>(
+        "bugprone-nondeterministic-pointer-usage");
     CheckFactories.registerCheck<NonZeroEnumToBoolConversionCheck>(
         "bugprone-non-zero-enum-to-bool-conversion");
     CheckFactories.registerCheck<NotNullTerminatedResultCheck>(
diff --git a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
index cb0d8ae98bac58..5628572b984226 100644
--- a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
@@ -44,6 +44,7 @@ add_clang_library(clangTidyBugproneModule
   MultipleNewInOneExpressionCheck.cpp
   MultipleStatementMacroCheck.cpp
   NoEscapeCheck.cpp
+  NondeterministicPointerUsageCheck.cpp
   NonZeroEnumToBoolConversionCheck.cpp
   NotNullTerminatedResultCheck.cpp
   OptionalValueConversionCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
new file mode 100644
index 00000000000000..ddc314af739d97
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.cpp
@@ -0,0 +1,67 @@
+//===--- NondetermnisticPointerUsageCheck.cpp - clang-tidy ------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "NondeterministicPointerUsageCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/Lex/Lexer.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::bugprone {
+
+void NondeterministicPointerUsageCheck::registerMatchers(MatchFinder *Finder) {
+
+  auto LoopVariable = varDecl(hasType(hasCanonicalType(pointerType())));
+
+  auto RangeInit = declRefExpr(to(varDecl(hasType(recordDecl(
+      anyOf(hasName("std::unordered_set"), hasName("std::unordered_map"),
+            hasName("std::unordered_multiset"),
+            hasName("std::unordered_multimap")))))));
+
+  Finder->addMatcher(
+      stmt(cxxForRangeStmt(hasRangeInit(RangeInit.bind("rangeinit")),
+                           hasLoopVariable(LoopVariable.bind("loopVar"))))
+          .bind("cxxForRangeStmt"),
+      this);
+
+  auto SortFuncM = anyOf(callee(functionDecl(hasName("std::is_sorted"))),
+                         callee(functionDecl(hasName("std::nth_element"))),
+                         callee(functionDecl(hasName("std::sort"))),
+                         callee(functionDecl(hasName("std::partial_sort"))),
+                         callee(functionDecl(hasName("std::partition"))),
+                         callee(functionDecl(hasName("std::stable_partition"))),
+                         callee(functionDecl(hasName("std::stable_sort"))));
+
+  auto IteratesPointerEltsM = hasArgument(
+      0,
+      cxxMemberCallExpr(on(hasType(cxxRecordDecl(has(fieldDecl(hasType(
+          hasCanonicalType(pointsTo(hasCanonicalType(pointerType())))))))))));
+
+  Finder->addMatcher(stmt(callExpr(allOf(SortFuncM, IteratesPointerEltsM)))
+                         .bind("sortsemantic"),
+                     this);
+}
+
+void NondeterministicPointerUsageCheck::check(
+    const MatchFinder::MatchResult &Result) {
+  const auto *ForRangePointers =
+      Result.Nodes.getNodeAs<Stmt>("cxxForRangeStmt");
+  const auto *SortPointers = Result.Nodes.getNodeAs<Stmt>("sortsemantic");
+
+  if ((ForRangePointers) && !(ForRangePointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<CXXForRangeStmt>(ForRangePointers);
+    diag(Node->getRParenLoc(), "Iteration of pointers is nondeterministic");
+  }
+
+  if ((SortPointers) && !(SortPointers->getBeginLoc().isMacroID())) {
+    const auto *Node = dyn_cast<Stmt>(SortPointers);
+    diag(Node->getBeginLoc(), "Sorting pointers is nondeterministic");
+  }
+}
+
+} // namespace clang::tidy::bugprone
diff --git a/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
new file mode 100644
index 00000000000000..556a4fce4dffe2
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NondeterministicPointerUsageCheck.h
@@ -0,0 +1,36 @@
+//===--- NondeterministicPointerUsageCheck.h - clang-tidy ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
+
+#include "../ClangTidyCheck.h"
+
+namespace clang::tidy::bugprone {
+
+/// Finds temporaries that look like RAII objects.
+///
+/// For the user-facing documentation see:
+/// http://clang.llvm.org/extra/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.html
+class NondeterministicPointerUsageCheck : public ClangTidyCheck {
+public:
+  NondeterministicPointerUsageCheck(StringRef Name, ClangTidyContext *Context)
+      : ClangTidyCheck(Name, Context) {}
+  bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
+    return LangOpts.CPlusPlus;
+  }
+  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
+  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
+  std::optional<TraversalKind> getCheckTraversalKind() const override {
+    return TK_IgnoreUnlessSpelledInSource;
+  }
+};
+
+} // namespace clang::tidy::bugprone
+
+#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE__NONDETERMINISTICPOINTERUSAGECHECK_H
diff --git a/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
new file mode 100644
index 00000000000000..8625487c6d694a
--- /dev/null
+++ b/clang-tools-extra/docs/clang-tidy/checks/bugprone/nondeterministic-pointer-usage.rst
@@ -0,0 +1,31 @@
+.. title:: clang-tidy - bugprone-nondeterministic-pointer-usage
+
+nondeterministic-pointer-usage
+==============================
+
+Finds nondeterministic usages of pointers in unordered containers.
+
+One canonical example is iteration across a container of pointers.
+
+.. code-block:: c++
+
+  {
+    for (auto i : UnorderedPtrSet)
+      f(i);
+  }
+
+Another such example is sorting a container of pointers.
+
+.. code-block:: c++
+
+  {
+    std::sort(VectorOfPtr.begin(), VectorOfPtr.end());
+  }
+
+Iteration of a containers of pointers may present the order of different
+pointers differently across different runs of a program. In some cases this
+may be acceptable behavior, in others this may be unexpected behavior. This
+check is advisory for this reason.
+
+This check only detects range-based for loops over unordered sets. Other
+similar usages will not be found and are false negatives.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
new file mode 100644
index 00000000000000..b279a8f20d8ddd
--- /dev/null
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/system-header-simulator-cxx.h
@@ -0,0 +1,1450 @@
+// Like the compiler, the static analyzer treats some functions differently if
+// they come from a system header -- for example, it is assumed that system
+// functions do not arbitrarily free() their parameters, and that some bugs
+// found in system headers cannot be fixed by the user and should be
+// suppressed.
+#pragma clang system_header
+
+typedef unsigned char uint8_t;
+
+typedef __typeof__(sizeof(int)) size_t;
+typedef __typeof__((char*)0-(char*)0) ptrdiff_t;
+void *memmove(void *s1, const void *s2, size_t n);
+
+namespace std {
+  typedef size_t size_type;
+#if __cplusplus >= 201103L
+  using nullptr_t = decltype(nullptr);
+#endif
+}
+
+namespace std {
+  struct input_iterator_tag { };
+  struct output_iterator_tag { };
+  struct forward_iterator_tag : public input_iterator_tag { };
+  struct bidirectional_iterator_tag : public forward_iterator_tag { };
+  struct random_access_iterator_tag : public bidirectional_iterator_tag { };
+
+  template <typename Iterator> struct iterator_traits {
+    typedef typename Iterator::difference_type difference_type;
+    typedef typename Iterator::value_type value_type;
+    typedef typename Iterator::pointer pointer;
+    typedef typename Iterator::reference reference;
+    typedef typename Iterator::iterator_category iterator_category;
+  };
+}
+
+template <typename T, typename Ptr, typename Ref> struct __vector_iterator {
+  typedef __vector_iterator<T, T *, T &> iterator;
+  typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __vector_iterator(const Ptr p = 0) : ptr(p) {}
+  __vector_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __vector_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __vector_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __vector_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __vector_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __vector_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  template<typename U, typename Ptr2, typename Ref2>
+  difference_type operator-(const __vector_iterator<U, Ptr2, Ref2> &rhs);
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __deque_iterator {
+  typedef __deque_iterator<T, T *, T &> iterator;
+  typedef __deque_iterator<T, const T *, const T &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::random_access_iterator_tag iterator_category;
+
+  __deque_iterator(const Ptr p = 0) : ptr(p) {}
+  __deque_iterator(const iterator &rhs): ptr(rhs.base()) {}
+  __deque_iterator<T, Ptr, Ref>& operator++() { ++ ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    ++ ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator--() { -- ptr; return *this; }
+  __deque_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this; -- ptr;
+    return tmp;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+(difference_type n) {
+    return ptr + n;
+  }
+  friend __deque_iterator<T, Ptr, Ref> operator+(
+      difference_type n,
+      const __deque_iterator<T, Ptr, Ref> &iter) {
+    return n + iter.ptr;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-(difference_type n) {
+    return ptr - n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator+=(difference_type n) {
+    return ptr += n;
+  }
+  __deque_iterator<T, Ptr, Ref> operator-=(difference_type n) {
+    return ptr -= n;
+  }
+
+  Ref operator*() const { return *ptr; }
+  Ptr operator->() const { return ptr; }
+
+  Ref operator[](difference_type n) {
+    return *(ptr+n);
+  }
+
+  bool operator==(const iterator &rhs) const { return ptr == rhs.ptr; }
+  bool operator==(const const_iterator &rhs) const { return ptr == rhs.ptr; }
+
+  bool operator!=(const iterator &rhs) const { return ptr != rhs.ptr; }
+  bool operator!=(const const_iterator &rhs) const { return ptr != rhs.ptr; }
+
+  const Ptr& base() const { return ptr; }
+
+private:
+  Ptr ptr;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __list_iterator {
+  typedef __list_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __list_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::bidirectional_iterator_tag iterator_category;
+
+  __list_iterator(T* it = 0) : item(it) {}
+  __list_iterator(const iterator &rhs): item(rhs.item) {}
+  __list_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __list_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  __list_iterator<T, Ptr, Ref> operator--() { item = item->prev; return *this; }
+  __list_iterator<T, Ptr, Ref> operator--(int) {
+    auto tmp = *this;
+    item = item->prev;
+    return tmp;
+  }
+
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __list_iterator;
+
+private:
+  T* item;
+};
+
+template <typename T, typename Ptr, typename Ref> struct __fwdl_iterator {
+  typedef __fwdl_iterator<T, __typeof__(T::data) *, __typeof__(T::data) &> iterator;
+  typedef __fwdl_iterator<T, const __typeof__(T::data) *, const __typeof__(T::data) &> const_iterator;
+
+  typedef ptrdiff_t difference_type;
+  typedef T value_type;
+  typedef Ptr pointer;
+  typedef Ref reference;
+  typedef std::forward_iterator_tag iterator_category;
+
+  __fwdl_iterator(T* it = 0) : item(it) {}
+  __fwdl_iterator(const iterator &rhs): item(rhs.item) {}
+  __fwdl_iterator<T, Ptr, Ref>& operator++() { item = item->next; return *this; }
+  __fwdl_iterator<T, Ptr, Ref> operator++(int) {
+    auto tmp = *this;
+    item = item->next;
+    return tmp;
+  }
+  Ref operator*() const { return item->data; }
+  Ptr operator->() const { return &item->data; }
+
+  bool operator==(const iterator &rhs) const { return item == rhs->item; }
+  bool operator==(const const_iterator &rhs) const { return item == rhs->item; }
+
+  bool operator!=(const iterator &rhs) const { return item != rhs->item; }
+  bool operator!=(const const_iterator &rhs) const { return item != rhs->item; }
+
+  const T* &base() const { return item; }
+
+  template <typename UT, typename UPtr, typename URef>
+  friend struct __fwdl_iterator;
+
+private:
+  T* item;
+};
+
+namespace std {
+  template <class T1, class T2>
+  struct pair {
+    T1 first;
+    T2 second;
+
+    pair() : first(), second() {}
+    pair(const T1 &a, const T2 &b) : first(a), second(b) {}
+
+    template<class U1, class U2>
+    pair(const pair<U1, U2> &other) : first(other.first),
+                                      second(other.second) {}
+  };
+
+  template<class T2, class T1>
+  T2& get(pair<T1, T2>& p) ;
+  template<class T1, class T2>
+  T1& get(const pair<T1, T2>& p) ;
+
+  typedef __typeof__(sizeof(int)) size_t;
+
+  template <class T> class initializer_list;
+
+  template< class T > struct remove_reference      {typedef T type;};
+  template< class T > struct remove_reference<T&>  {typedef T type;};
+  template< class T > struct remove_reference<T&&> {typedef T type;};
+
+  template<typename T> typename remove_reference<T>::type&& move(T&& a);
+  template <typename T> T *__addressof(T &x);
+  template <typename T> T *addressof(T &x);
+  template <typename T> const T& as_const(T& x);
+  template <typename T> T&& forward(T&& x);
+  // FIXME: Declare forward_like
+  // FIXME: Declare move_if_noexcept
+
+  template< class T >
+  using remove_reference_t = typename remove_reference<T>::type;
+
+  template <class T>
+  void swap(T &a, T &b) {
+    T c(std::move(a));
+    a = std::move(b);
+    b = std::move(c);
+  }
+
+  template<typename T>
+  class vector {
+    T *_start;
+    T *_finish;
+    T *_end_of_storage;
+
+  public:
+    typedef T value_type;
+    typedef size_t size_type;
+    typedef __vector_iterator<T, T *, T &> iterator;
+    typedef __vector_iterator<T, const T *, const T &> const_iterator;
+
+    vector() : _start(0), _finish(0), _end_of_storage(0) {}
+    template <typename InputIterator>
+    vector(InputIterator first, InputIterator last);
+    vector(const vector &other);
+    vector(vector &&other);
+    ~vector();
+
+    size_t size() const {
+      return size_t(_finish - _start);
+    }
+
+    vector& operator=(const vector &other);
+    vector& operator=(vector &&other);
+    vector& operator=(std::initializer_list<T> ilist);
+
+    void assign(size_type count, const T &value);
+    template <typename InputIterator >
+    void assign(InputIterator first, InputIterator last);
+    void assign(std::initializer_list<T> ilist);
+
+    void clear();
+
+    void push_back(const T &value);
+    void push_back(T &&value);
+    template<class... Args>
+    void emplace_back(Args&&... args);
+    void pop_back();
+
+    iterator insert(const_iterator position, const value_type &val);
+    iterator insert(const_iterator position, size_type n,
+                    const value_type &val);
+    template <typename InputIterator>
+    iterator insert(const_iterator position, InputIterator first,
+                    InputIterator last);
+    iterator insert(const_iterator position, value_type &&val);
+    iterator insert(const_iterator position, initializer_list<value_type> il);
+
+    template <class... Args>
+    iterator emplace(const_iterator position, Args&&... args);
+
+    iterator erase(const_iterator position);
+    iterator erase(const_iterator first, const_iterator last);
+
+    T &operator[](size_t n) {
+      return _start[n];
+    }
+
+    const T &operator[](size_t n) const {
+      return _start[n];
+    }
+
+    iterator begin() { return iterator(_start); }
+    const_iterator begin() const { return const_iterator(_start); }
+    const_iterator cbegin() const { return const_iterator(_start); }
+    iterator end() { return iterator(_finish); }
+    const_iterator end() const { return const_iterator(_finish); }
+    const_iterator cend() const { return const_iterator(_finish); }
+    T& front() { return *begin(); }
+    const T& front() const { return *begin(); }
+    T& back() { return *(end() - 1); }
+    const T& back() const { re...
[truncated]

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we had something like this. It might be possible, but really really challenging to implement.
To me the problem is that the iteration or lookups are not necessarily bad. And to determine if it's bad one needs to understand how the result of the lookup or the individual elements of the iteration are used.

For example, this is completely fine and efficient:

int numInterestingExprs = 0; // TODO: Use accumulate.
for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) ++numInterestingExprs;
}

But this is not fine, because we apply a sideffect that is dependent on the nondeterministic iteration order. In some context, this could be still fine btw. It depends on what your constraints are:

for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) llvm::errs() << E->printPretty() << "\n"; // eeeh, the lines may get reshuffled from run-to-run...
}

Only looking at the for loop here is not enough to decide this.


Limiting the check to only trigger if it sees the iteration AND also the use-site - like in the case you match std algorithms, like is_sorted or partition is a much more careful approach. It should work unless the algorithm takes a callback/functor that we can't really reason about and we could end up with a FP.

Remember that begin might be an ADL free-function, and not a memer function, such as for int *arr[10]. C++ also has std::ranges::*, which we may wanna cover too in the future.


You mentioned that you evaluated the checker at scale. What was the result of the evaluation? I assume you have seen FPs, and TPs as well.

Copy link
Contributor

@EugeneZelenko EugeneZelenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mention new check in Release Notes.

@whisperity whisperity marked this pull request as draft September 30, 2024 19:45
@whisperity
Copy link
Member

Q) I ended up expanding upon the C++ simulation header, but had thoughts about breaking that up into multiple smaller files. Maybe there's an opportunity to refactor some C++ simulation files across multiple checkers as a seperate PR first, or maybe even as part of this one?

@vabridgers Definitely do such things in separate PRs. It makes the current patch cleaner if the additions to the sysroot simulation only contains what is needed for your patch.

Copy link
Member

@whisperity whisperity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also, for reference, I dug out the original CSA checkers' initial reviews from the history: https://reviews.llvm.org/D50488 and https://reviews.llvm.org/D59279.)

Even though the removed checks are alpha, maybe their removal is worth a mention in the release notes for CSA, if we have such a thing in the first place. @Szelethus?

@whisperity
Copy link
Member

And to determine if it's bad one needs to understand how the result of the lookup or the individual elements of the iteration are used.

And the issue here is that, in the extreme case, even a completely omniscient call graph oracle would still fail at proving immunity to non-determinism even in your second case, because the determinism requirement could come from something like FileCheck ran externally to the code that executed a non-deterministic order.

I'm unsure as to what the right approach would be here. I was thinking about instead marking every variable or field declared with a type like known-unordered-container-type <T*> instead of the for loops themselves, but then we are essentially trying to say that this partial specialisation is "deprecated" in a way... "you shouldn't put pointers into this because no matter what you do, it will have a non-deterministic order"... and???

But your second example, @steakhal, touches on the problem nicely. What if we tried to match archetypes of "container elements are printed" or "container elements are sent to the network"? It will never be a complete list, but we could reasonably assume (at least this is a safe-ish generalisation, and for everything else, there is NOLINT...) that if the order escapes via some I/O, then non-determinism is... maybe if not outright bad, but definitely uncanny, if for nothing else, due to the troubles it causes when debugging. 😉

@vabridgers
Copy link
Contributor Author

I wish we had something like this. It might be possible, but really really challenging to implement. To me the problem is that the iteration or lookups are not necessarily bad. And to determine if it's bad one needs to understand how the result of the lookup or the individual elements of the iteration are used.

For example, this is completely fine and efficient:

int numInterestingExprs = 0; // TODO: Use accumulate.
for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) ++numInterestingExprs;
}

But this is not fine, because we apply a sideffect that is dependent on the nondeterministic iteration order. In some context, this could be still fine btw. It depends on what your constraints are:

for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) llvm::errs() << E->printPretty() << "\n"; // eeeh, the lines may get reshuffled from run-to-run...
}

Only looking at the for loop here is not enough to decide this.

Limiting the check to only trigger if it sees the iteration AND also the use-site - like in the case you match std algorithms, like is_sorted or partition is a much more careful approach. It should work unless the algorithm takes a callback/functor that we can't really reason about and we could end up with a FP.

Remember that begin might be an ADL free-function, and not a memer function, such as for int *arr[10]. C++ also has std::ranges::*, which we may wanna cover too in the future.

You mentioned that you evaluated the checker at scale. What was the result of the evaluation? I assume you have seen FPs, and TPs as well.

Hi @steakhal , I completed a test run on the LLVM source base and detected no issues with this check. I will be expanding the tests, and wanted to mention I just found

I wish we had something like this. It might be possible, but really really challenging to implement. To me the problem is that the iteration or lookups are not necessarily bad. And to determine if it's bad one needs to understand how the result of the lookup or the individual elements of the iteration are used.

For example, this is completely fine and efficient:

int numInterestingExprs = 0; // TODO: Use accumulate.
for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) ++numInterestingExprs;
}

But this is not fine, because we apply a sideffect that is dependent on the nondeterministic iteration order. In some context, this could be still fine btw. It depends on what your constraints are:

for (const Expr *E : MyBlocks/*some unordered set-like container*/) {
  if (isInteresting(E)) llvm::errs() << E->printPretty() << "\n"; // eeeh, the lines may get reshuffled from run-to-run...
}

Only looking at the for loop here is not enough to decide this.

Limiting the check to only trigger if it sees the iteration AND also the use-site - like in the case you match std algorithms, like is_sorted or partition is a much more careful approach. It should work unless the algorithm takes a callback/functor that we can't really reason about and we could end up with a FP.

Remember that begin might be an ADL free-function, and not a memer function, such as for int *arr[10]. C++ also has std::ranges::*, which we may wanna cover too in the future.

You mentioned that you evaluated the checker at scale. What was the result of the evaluation? I assume you have seen FPs, and TPs as well.

I completed an initial test run on the llvm/clang repo with no crashes or results at all. Meaning no FPs or TPs. I will be expanding the tests to more repos, and would love to hear suggestions if you have any. I suppose abseil would be a good one. Also just found @Xazax-hun 's testbench to help with this here - https://github.com/Xazax-hun/csa-testbench. Really looking forward to trying that :)

Thanks for the comments. Will keep driving this forward.

@carlosgalvezp
Copy link
Contributor

carlosgalvezp commented Sep 30, 2024

creation of an advisory group.

Why isn't misc suitable for this use case? It would be confusing to have both groups I think, and may lead to bike-shedding discussions in the future about whether a check belongs in misc or advisory.

@vabridgers
Copy link
Contributor Author

Why isn't misc suitable for this use case?

bugprone was just an initial thought. If the group think leads us to misc I'm ok with moving in that direction.
Other thoughts?

@carlosgalvezp
Copy link
Contributor

Why isn't misc suitable for this use case?

bugprone was just an initial thought. If the group think leads us to misc I'm ok with moving in that direction. Other thoughts?

Maybe i misunderstood, I thought you meant creating a new clang-tidy module called advisory. bugprone sounds good to me!

@vabridgers
Copy link
Contributor Author

vabridgers commented Oct 6, 2024

As I understand the collective comments so far:

  • bugprone group is fine. I've heard no objections to bugprone so will stay with that.
  • Break up the simulation header files, only include what's actually needed for the test
  • change the checker name to something like bugprone-nondeterministic-pointer-iteration-order
  • provide testing results, expand upon testing
  • Update the documentation to meet requirements
  • I've heard no concrete comments on expanding beyond the few cases this check currently addresses. I'll assume that's ok for now.
  • While there has been at least one comment to add additional patterns, I'd like to keep the scope limited for this PR to just moving the static analysis checks to a single clang-tidy check. Would that be ok with the reviewers?

Next update should address these points. Thanks for the comments!

@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch from 52e3908 to 4d8d105 Compare October 6, 2024 13:31
@vabridgers
Copy link
Contributor Author

Hi all, I believe all comments have been addressed. Thanks.

Copy link

github-actions bot commented Oct 6, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

…tidy

This change moves the alpha.nondeterministic.PointerSorting and
alpha.nondeterministic.PointerIteration static analyzer checkers to
a single clang-tidy check. Those checkers were implemented as clang-tidy
checks wrapped in the static analyzer framework. The documentation was
updated to describe what the checks can and cannot do, and testing
was completed on a broad set of open source projects.
@vabridgers vabridgers force-pushed the nondeterministic-pointer-usage branch from 4d8d105 to 3ab0347 Compare October 6, 2024 13:39
@vabridgers vabridgers marked this pull request as ready for review October 6, 2024 14:03
@vabridgers vabridgers changed the title RFC: [clang-tidy] [analyzer] Nondeterministic pointer usage improvements RFC: [clang-tidy] [analyzer] Move nondeterministic pointer usage check to tidy Oct 6, 2024
Copy link
Contributor

@5chmidti 5chmidti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added comments for cleaning up the implementation, and also questions/suggestions for future enhancements that are prefixed with ~:. I went with basically-NFC-cleanups, instead of improvements, in order to have this PR stay a moving to clang-tidy PR but improving the implementation quality.
There are various points that should be addressed to enhance the check after this lands, and we should think about better detection rates, as discussed above, some more.

Comment on lines +178 to +179
CheckFactories.registerCheck<NondeterministicPointerUsageCheck>(
"bugprone-nondeterministic-pointer-iteration-order");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class and the check have different names, please choose one as that is an expectable pattern, and helps in navigating the files.
In case of the class and files needing a rename, checkout https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clang-tidy/rename_check.py)

Comment on lines +21 to +24
auto RangeInit = declRefExpr(to(varDecl(hasType(recordDecl(
anyOf(hasName("std::unordered_set"), hasName("std::unordered_map"),
hasName("std::unordered_multiset"),
hasName("std::unordered_multimap")))))));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change these multiple hasName matchers to the hasAnyName matcher (variadic)

Comment on lines +32 to +38
auto SortFuncM = anyOf(callee(functionDecl(hasName("std::is_sorted"))),
callee(functionDecl(hasName("std::nth_element"))),
callee(functionDecl(hasName("std::sort"))),
callee(functionDecl(hasName("std::partial_sort"))),
callee(functionDecl(hasName("std::partition"))),
callee(functionDecl(hasName("std::stable_partition"))),
callee(functionDecl(hasName("std::stable_sort"))));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above with hasName, but this can also be replaced with

auto SortFuncM = callee(functionDecl(hasAnyName(...)))


auto LoopVariable = varDecl(hasType(hasCanonicalType(pointerType())));

auto RangeInit = declRefExpr(to(varDecl(hasType(recordDecl(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~: If you use varDecl(hasType(qualType(hasCanonicalType(...)))), variables where the type is an alias to one of these, will also be found


auto IteratesPointerEltsM = hasArgument(
0,
cxxMemberCallExpr(on(hasType(cxxRecordDecl(has(fieldDecl(hasType(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~: hasType -> hasType(qualType(hasCanonicalType( with the same aliasing reasoning as above

Key *ptr;
};

public:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: not needed public

Comment on lines +18 to +27
class iterator {
public:
iterator(Key *key): ptr(key) {}
iterator& operator++() { ++ptr; return *this; }
bool operator!=(const iterator &other) const { return ptr != other.ptr; }
const Key &operator*() const { return *ptr; }
private:
Key *ptr;
};
public:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Weird indentation + not needed public in ln 27

Key *ptr;
};

public:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: not needed public

@@ -121,6 +121,12 @@ New checks
Gives warnings for tagged unions, where the number of tags is
different from the number of data members inside the union.

- New :doc:`bugprone-nondeterministic-pointer-iteration-order
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please sort this in alphabetically, so one entry up

Comment on lines +37 to +38
for (auto &i : UnorderedPtrSet) // no-warning
f(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~: Don't we want to detect this case as well?

@5chmidti
Copy link
Contributor

5chmidti commented Oct 8, 2024

I'd like to keep the scope limited for this PR to just moving the static analysis checks to a single clang-tidy check. Would that be ok with the reviewers?

Fine with me, yes, but there needs to be some sort of plan on how to address some of the pointed out challenges with this check, that were raised by others.


It might be possible, but really really challenging to implement.
...
Only looking at the for loop here is not enough ...

+1

because we apply a sideffect that is dependent on the nondeterministic iteration order

What if we tried to match archetypes of "container elements are printed" or "container elements are sent to the network"? It will never be a complete list, but we could reasonably assume (at least this is a safe-ish generalisation, and for everything else, there is NOLINT...) that if the order escapes via some I/O, then non-determinism is... maybe if not outright bad, but definitely uncanny, if for nothing else, due to the troubles it causes when debugging.

That's of course possible, but may grow to 'too many' patterns fast, and then there comes the question of which ones to add and which are considered to be too infrequent. The other issue is that doing this would mean that either clang-tidy has to encode the use of many different libraries, and how side effects can happen by using them, or by providing customization options, which would in turn be limited by the patterns that are considered generic enough to warrant the customizable detection pattern (e.g., printing and formatting, and then there are formatToString member functions as well, that e.g., print into a preallocated buffer).

Maybe Expr::hasSideEffets can be used as a baseline?
I.e.,

for all references r to var of type unordered map
  traverse up the AST until the top-level expression e (not stmt) is found (containing r)
    if e->hasSideEffects()
      diagnose()

(maybe not literally, to avoid using the parent map too much for traversal)

For calling functions, there could also be a (cached) lookup of their bodies (if available), using Expr::hasSideEffects for the DeclRefExpr of the parameter. bugprone-exception-escape is doing something similar, but for throw statements, instead of checking for variable uses with side effects.

@vabridgers
Copy link
Contributor Author

@5chmidti , thanks for the comments. I'll start to work through those.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category clang-tidy clang-tools-extra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants