Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang-tidy] Only expand <inttypes.h> macros in modernize-use-std-format/print #97911

Merged
merged 2 commits into from
Oct 8, 2024

Conversation

mikecrowe
Copy link
Contributor

@mikecrowe mikecrowe commented Jul 6, 2024

Expanding all macros in the printf/absl::StrFormat format string before conversion could easily break code if those macros are expanded change their definition between builds. It's important for this check to expand the <inttypes.h> PRI macros though, so let's ensure that the presence of any other macros in the format string causes the check to emit a warning and not perform any conversion.

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 6, 2024

@llvm/pr-subscribers-clang-tools-extra

Author: Mike Crowe (mikecrowe)

Changes

Expanding all macros in the printf/absl::StrFormat format string before conversion could easily break code if those macros are expended to change their definition between builds. It's important for this check to expand the <inttypes.h> PRI macros though, so let's ensure that the presence of any other macros in the format string causes the check to emit a warning and not perform any conversion.


Patch is 21.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97911.diff

11 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.cpp (+4-3)
  • (modified) clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.h (+1)
  • (modified) clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.cpp (+3-1)
  • (modified) clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.h (+1)
  • (modified) clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp (+47-5)
  • (modified) clang-tools-extra/clang-tidy/utils/FormatStringConverter.h (+5-1)
  • (modified) clang-tools-extra/docs/ReleaseNotes.rst (+2-1)
  • (modified) clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-print.rst (+13-9)
  • (modified) clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/inttypes.h (+16-10)
  • (modified) clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp (+44-9)
  • (modified) clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print.cpp (+45-2)
diff --git a/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.cpp b/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.cpp
index d082faa786b37..7e8cbd40fe07a 100644
--- a/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.cpp
@@ -44,6 +44,7 @@ void UseStdFormatCheck::registerPPCallbacks(const SourceManager &SM,
                                             Preprocessor *PP,
                                             Preprocessor *ModuleExpanderPP) {
   IncludeInserter.registerPreprocessor(PP);
+  this->PP = PP;
 }
 
 void UseStdFormatCheck::registerMatchers(MatchFinder *Finder) {
@@ -78,9 +79,9 @@ void UseStdFormatCheck::check(const MatchFinder::MatchResult &Result) {
 
   utils::FormatStringConverter::Configuration ConverterConfig;
   ConverterConfig.StrictMode = StrictMode;
-  utils::FormatStringConverter Converter(Result.Context, StrFormat,
-                                         FormatArgOffset, ConverterConfig,
-                                         getLangOpts());
+  utils::FormatStringConverter Converter(
+      Result.Context, StrFormat, FormatArgOffset, ConverterConfig,
+      getLangOpts(), *Result.SourceManager, *PP);
   const Expr *StrFormatCall = StrFormat->getCallee();
   if (!Converter.canApply()) {
     diag(StrFormat->getBeginLoc(),
diff --git a/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.h b/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.h
index b59a4708c6e4b..9ac2240212ebf 100644
--- a/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.h
+++ b/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.h
@@ -44,6 +44,7 @@ class UseStdFormatCheck : public ClangTidyCheck {
   StringRef ReplacementFormatFunction;
   utils::IncludeInserter IncludeInserter;
   std::optional<StringRef> MaybeHeaderToInclude;
+  Preprocessor *PP = nullptr;
 };
 
 } // namespace clang::tidy::modernize
diff --git a/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.cpp b/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.cpp
index 1ea170c3cd310..69136c10d927b 100644
--- a/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.cpp
@@ -68,6 +68,7 @@ void UseStdPrintCheck::registerPPCallbacks(const SourceManager &SM,
                                            Preprocessor *PP,
                                            Preprocessor *ModuleExpanderPP) {
   IncludeInserter.registerPreprocessor(PP);
+  this->PP = PP;
 }
 
 static clang::ast_matchers::StatementMatcher
@@ -137,7 +138,8 @@ void UseStdPrintCheck::check(const MatchFinder::MatchResult &Result) {
   ConverterConfig.StrictMode = StrictMode;
   ConverterConfig.AllowTrailingNewlineRemoval = true;
   utils::FormatStringConverter Converter(
-      Result.Context, Printf, FormatArgOffset, ConverterConfig, getLangOpts());
+      Result.Context, Printf, FormatArgOffset, ConverterConfig, getLangOpts(),
+      *Result.SourceManager, *PP);
   const Expr *PrintfCall = Printf->getCallee();
   const StringRef ReplacementFunction = Converter.usePrintNewlineFunction()
                                             ? ReplacementPrintlnFunction
diff --git a/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.h b/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.h
index 7a06cf38b4264..995c740389e73 100644
--- a/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.h
+++ b/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.h
@@ -36,6 +36,7 @@ class UseStdPrintCheck : public ClangTidyCheck {
   }
 
 private:
+  Preprocessor *PP;
   bool StrictMode;
   std::vector<StringRef> PrintfLikeFunctions;
   std::vector<StringRef> FprintfLikeFunctions;
diff --git a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
index 33f3ea47df1e3..686c8fb71fae5 100644
--- a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
+++ b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
@@ -18,6 +18,7 @@
 #include "clang/ASTMatchers/ASTMatchFinder.h"
 #include "clang/Basic/LangOptions.h"
 #include "clang/Lex/Lexer.h"
+#include "clang/Lex/Preprocessor.h"
 #include "clang/Tooling/FixIt.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/Debug.h"
@@ -195,11 +196,10 @@ static bool castMismatchedIntegerTypes(const CallExpr *Call, bool StrictMode) {
   return false;
 }
 
-FormatStringConverter::FormatStringConverter(ASTContext *ContextIn,
-                                             const CallExpr *Call,
-                                             unsigned FormatArgOffset,
-                                             const Configuration ConfigIn,
-                                             const LangOptions &LO)
+FormatStringConverter::FormatStringConverter(
+    ASTContext *ContextIn, const CallExpr *Call, unsigned FormatArgOffset,
+    const Configuration ConfigIn, const LangOptions &LO, SourceManager &SM,
+    Preprocessor &PP)
     : Context(ContextIn), Config(ConfigIn),
       CastMismatchedIntegerTypes(
           castMismatchedIntegerTypes(Call, ConfigIn.StrictMode)),
@@ -208,11 +208,22 @@ FormatStringConverter::FormatStringConverter(ASTContext *ContextIn,
   assert(ArgsOffset <= NumArgs);
   FormatExpr = llvm::dyn_cast<StringLiteral>(
       Args[FormatArgOffset]->IgnoreImplicitAsWritten());
+
   if (!FormatExpr || !FormatExpr->isOrdinary()) {
     // Function must have a narrow string literal as its first argument.
     conversionNotPossible("first argument is not a narrow string literal");
     return;
   }
+
+  if (const auto MaybeMacroName =
+          formatStringContainsUnreplaceableMacro(FormatExpr, SM, PP);
+      MaybeMacroName) {
+    conversionNotPossible(
+        ("format string contains unreplaceable macro '" + *MaybeMacroName + "'")
+            .str());
+    return;
+  }
+
   PrintfFormatString = FormatExpr->getString();
 
   // Assume that the output will be approximately the same size as the input,
@@ -230,6 +241,37 @@ FormatStringConverter::FormatStringConverter(ASTContext *ContextIn,
   finalizeFormatText();
 }
 
+std::optional<StringRef>
+FormatStringConverter::formatStringContainsUnreplaceableMacro(
+    const StringLiteral *FormatExpr, SourceManager &SM, Preprocessor &PP) {
+  for (auto I = FormatExpr->tokloc_begin(), E = FormatExpr->tokloc_end();
+       I != E; ++I) {
+    const SourceLocation &TokenLoc = *I;
+    if (TokenLoc.isMacroID()) {
+      const StringRef MacroName =
+          Lexer::getImmediateMacroName(TokenLoc, SM, PP.getLangOpts());
+
+      // glibc uses __PRI64_PREFIX and __PRIPTR_PREFIX to define the prefixes
+      // for types that change size so we must look for multiple prefixes.
+      if (!MacroName.starts_with("PRI") && !MacroName.starts_with("__PRI"))
+        return MacroName;
+
+      const SourceLocation TokenSpellingLoc = SM.getSpellingLoc(TokenLoc);
+      const OptionalFileEntryRef MaybeFileEntry =
+          SM.getFileEntryRefForID(SM.getFileID(TokenSpellingLoc));
+      if (!MaybeFileEntry)
+        return MacroName;
+
+      HeaderSearch &HS = PP.getHeaderSearchInfo();
+      // Check if the file is a system header
+      if (!isSystem(HS.getFileDirFlavor(*MaybeFileEntry)) ||
+          llvm::sys::path::filename(MaybeFileEntry->getName()) != "inttypes.h")
+        return MacroName;
+    }
+  }
+  return std::nullopt;
+}
+
 void FormatStringConverter::emitAlignment(const PrintfSpecifier &FS,
                                           std::string &FormatSpec) {
   ConversionSpecifier::Kind ArgKind = FS.getConversionSpecifier().getKind();
diff --git a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.h b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.h
index 1109a0b602262..5d4b694647aad 100644
--- a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.h
+++ b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.h
@@ -40,7 +40,8 @@ class FormatStringConverter
 
   FormatStringConverter(ASTContext *Context, const CallExpr *Call,
                         unsigned FormatArgOffset, Configuration Config,
-                        const LangOptions &LO);
+                        const LangOptions &LO, SourceManager &SM,
+                        Preprocessor &PP);
 
   bool canApply() const { return ConversionNotPossibleReason.empty(); }
   const std::string &conversionNotPossibleReason() const {
@@ -110,6 +111,9 @@ class FormatStringConverter
 
   void appendFormatText(StringRef Text);
   void finalizeFormatText();
+  static std::optional<StringRef>
+  formatStringContainsUnreplaceableMacro(const StringLiteral *FormatExpr,
+                                         SourceManager &SM, Preprocessor &PP);
   bool conversionNotPossible(std::string Reason) {
     ConversionNotPossibleReason = std::move(Reason);
     return false;
diff --git a/clang-tools-extra/docs/ReleaseNotes.rst b/clang-tools-extra/docs/ReleaseNotes.rst
index e570c8184f8b0..36f8edbcaa73e 100644
--- a/clang-tools-extra/docs/ReleaseNotes.rst
+++ b/clang-tools-extra/docs/ReleaseNotes.rst
@@ -396,7 +396,8 @@ Changes in existing checks
 - Improved :doc:`modernize-use-std-print
   <clang-tidy/checks/modernize/use-std-print>` check to not crash if the
   format string parameter of the function to be replaced is not of the
-  expected type.
+  expected type. Only macros starting with ``PRI`` and ``__PRI`` from
+  ``<inttypes.h>`` are now expanded in the format string.
 
 - Improved :doc:`modernize-use-using <clang-tidy/checks/modernize/use-using>`
   check by adding support for detection of typedefs declared on function level.
diff --git a/clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-print.rst b/clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-print.rst
index 79648a1104bca..1e33d347f65a0 100644
--- a/clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-print.rst
+++ b/clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-print.rst
@@ -24,8 +24,15 @@ into:
   std::println(stderr, "The {} is {:3}", description, value);
 
 If the `ReplacementPrintFunction` or `ReplacementPrintlnFunction` options
-are left, or assigned to their default values then this check is only
-enabled with `-std=c++23` or later.
+are left at or set to their default values then this check is only enabled
+with `-std=c++23` or later.
+
+Macros starting with ``PRI`` and ``__PRI`` from `<inttypes.h>` are
+expanded, escaping is handled and adjacent strings are concatenated to form
+a single ``StringLiteral`` before the format string is converted. Use of
+any other macros in the format string will cause a warning message to be
+emitted and no conversion will be performed. The resultant converted format
+string will always be a single string literal.
 
 The check doesn't do a bad job, but it's not perfect. In particular:
 
@@ -34,13 +41,10 @@ The check doesn't do a bad job, but it's not perfect. In particular:
   possible.
 
 - At the point that the check runs, the AST contains a single
-  ``StringLiteral`` for the format string and any macro expansion, token
-  pasting, adjacent string literal concatenation and escaping has been
-  handled. Although it's possible for the check to automatically put the
-  escapes back, they may not be exactly as they were written (e.g.
-  ``"\x0a"`` will become ``"\n"`` and ``"ab" "cd"`` will become
-  ``"abcd"``.) This is helpful since it means that the ``PRIx`` macros from
-  ``<inttypes.h>`` are removed correctly.
+  ``StringLiteral`` for the format string where escapes have been expanded.
+  The check tries to put the escapes back, they may not be exactly as they
+  were written (e.g. ``"\x41\x0a"`` will become ``"A\n"`` and ``"ab" "cd"``
+  will become ``"abcd"``.)
 
 - It supports field widths, precision, positional arguments, leading zeros,
   leading ``+``, alignment and alternative forms.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/inttypes.h b/clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/inttypes.h
index 9dc7ae39b3a3f..74437f405931b 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/inttypes.h
+++ b/clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/inttypes.h
@@ -21,40 +21,46 @@ typedef __UINT32_TYPE__ uint32_t;
 typedef __UINT16_TYPE__ uint16_t;
 typedef __UINT8_TYPE__ uint8_t;
 
-#define PRIdMAX "lld"
-#define PRId64 "lld"
+#if __WORDSIZE == 64
+# define __PRI64_PREFIX	"l"
+#else
+# define __PRI64_PREFIX	"ll"
+#endif
+
+#define PRIdMAX __PRI64_PREFIX "d"
+#define PRId64 __PRI64_PREFIX "d"
 #define PRId32 "d"
 #define PRId16 "hd"
 #define PRId8  "hhd"
 
-#define PRIiMAX "lli"
-#define PRIi64 "lli"
+#define PRIiMAX __PRI64_PREFIX "i"
+#define PRIi64 __PRI64_PREFIX "i"
 #define PRIi32 "i"
 #define PRIi16 "hi"
 #define PRIi8  "hhi"
 
-#define PRIiFAST64 "lli"
+#define PRIiFAST64 __PRI64_PREFIX "i"
 #define PRIiFAST32 "i"
 #define PRIiFAST16 "hi"
 #define PRIiFAST8  "hhi"
 
-#define PRIiLEAST64 "lli"
+#define PRIiLEAST64 __PRI64_PREFIX "i"
 #define PRIiLEAST32 "i"
 #define PRIiLEAST16 "hi"
 #define PRIiLEAST8  "hhi"
 
-#define PRIuMAX "llu"
-#define PRIu64 "llu"
+#define PRIuMAX __PRI64_PREFIX "u"
+#define PRIu64 __PRI64_PREFIX "u"
 #define PRIu32 "u"
 #define PRIu16 "hu"
 #define PRIu8  "hhu"
 
-#define PRIuFAST64 "llu"
+#define PRIuFAST64 __PRI64_PREFIX "u"
 #define PRIuFAST32 "u"
 #define PRIuFAST16 "hu"
 #define PRIuFAST8  "hhu"
 
-#define PRIuLEAST64 "llu"
+#define PRIuLEAST64 __PRI64_PREFIX "u"
 #define PRIuLEAST32 "u"
 #define PRIuLEAST16 "hu"
 #define PRIuLEAST8  "hhu"
diff --git a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp
index e8dea1dce2c97..09e67c7397f00 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp
+++ b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp
@@ -1,13 +1,18 @@
 // RUN: %check_clang_tidy \
 // RUN:   -std=c++20 %s modernize-use-std-format %t -- \
 // RUN:   -config="{CheckOptions: {StrictMode: true}}" \
-// RUN:   -- -isystem %clang_tidy_headers
+// RUN:   -- -isystem %clang_tidy_headers \
+// RUN:      -DPRI_CMDLINE_MACRO="\"%s\"" \
+// RUN:      -D__PRI_CMDLINE_MACRO="\"%s\""
 // RUN: %check_clang_tidy \
 // RUN:   -std=c++20 %s modernize-use-std-format %t -- \
 // RUN:   -config="{CheckOptions: {StrictMode: false}}" \
-// RUN:   -- -isystem %clang_tidy_headers
+// RUN:   -- -isystem %clang_tidy_headers \
+// RUN:      -DPRI_CMDLINE_MACRO="\"%s\"" \
+// RUN:      -D__PRI_CMDLINE_MACRO="\"%s\""
 #include <string>
 // CHECK-FIXES: #include <format>
+#include <inttypes.h>
 
 namespace absl
 {
@@ -103,13 +108,6 @@ std::string StrFormat_macros() {
   // CHECK-MESSAGES: [[@LINE-1]]:13: warning: use 'std::format' instead of 'StrFormat' [modernize-use-std-format]
   // CHECK-FIXES: std::format("Hello {}", 42);
 
-  // The format string is replaced even though it comes from a macro, this
-  // behaviour is required so that that <inttypes.h> macros are replaced.
-#define FORMAT_STRING "Hello %s"
-  auto s2 = absl::StrFormat(FORMAT_STRING, 42);
-  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: use 'std::format' instead of 'StrFormat' [modernize-use-std-format]
-  // CHECK-FIXES: std::format("Hello {}", 42);
-
   // Arguments that are macros aren't replaced with their value, even if they are rearranged.
 #define VALUE 3.14159265358979323846
 #define WIDTH 10
@@ -117,4 +115,41 @@ std::string StrFormat_macros() {
   auto s3 = absl::StrFormat("Hello %*.*f", WIDTH, PRECISION, VALUE);
   // CHECK-MESSAGES: [[@LINE-1]]:13: warning: use 'std::format' instead of 'StrFormat' [modernize-use-std-format]
   // CHECK-FIXES: std::format("Hello {:{}.{}f}", VALUE, WIDTH, PRECISION);
+
+  const uint64_t u64 = 42;
+  const uint32_t u32 = 32;
+  std::string s;
+
+  auto s4 = absl::StrFormat("Replaceable macro at end %" PRIu64, u64);
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: use 'std::format' instead of 'StrFormat' [modernize-use-std-format]
+  // CHECK-FIXES: std::format("Replaceable macro at end {}", u64);
+
+  auto s5 = absl::StrFormat("Replaceable macros in middle %" PRIu64 " %" PRIu32 "\n", u64, u32);
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: use 'std::format' instead of 'StrFormat' [modernize-use-std-format]
+  // CHECK-FIXES: std::format("Replaceable macros in middle {} {}\n", u64, u32);
+
+// These need PRI and __PRI prefixes so that the check get as far as looking for
+// where the macro comes from.
+#define PRI_FMT_MACRO "%s"
+#define __PRI_FMT_MACRO "%s"
+
+  auto s6 = absl::StrFormat("Unreplaceable macro at end " PRI_FMT_MACRO, s.c_str());
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro 'PRI_FMT_MACRO' [modernize-use-std-format]
+
+  auto s7 = absl::StrFormat(__PRI_FMT_MACRO " Unreplaceable macro at beginning", s);
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro '__PRI_FMT_MACRO' [modernize-use-std-format]
+
+  auto s8 = absl::StrFormat("Unreplacemable macro " PRI_FMT_MACRO " in the middle", s);
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro 'PRI_FMT_MACRO' [modernize-use-std-format]
+
+  auto s9 = absl::StrFormat("First macro is replaceable %" PRIu64 " but second one is not " __PRI_FMT_MACRO, u64, s);
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro '__PRI_FMT_MACRO' [modernize-use-std-format]
+
+  // Needs a PRI prefix so that we get as far as looking for where the macro comes from
+  auto s10 = absl::StrFormat(" macro from command line " PRI_CMDLINE_MACRO, s);
+  // CHECK-MESSAGES: [[@LINE-1]]:14: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro 'PRI_CMDLINE_MACRO' [modernize-use-std-format]
+
+  // Needs a __PRI prefix so that we get as far as looking for where the macro comes from
+  auto s11 = absl::StrFormat(" macro from command line " __PRI_CMDLINE_MACRO, s);
+  // CHECK-MESSAGES: [[@LINE-1]]:14: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro '__PRI_CMDLINE_MACRO' [modernize-use-std-format]
 }
diff --git a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print.cpp b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print.cpp
index da1a18782c9be..fc953470209e2 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print.cpp
+++ b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print.cpp
@@ -1,11 +1,15 @@
 // RUN: %check_clang_tidy -check-suffixes=,STRICT \
 // RUN:   -std=c++23 %s modernize-use-std-print %t -- \
 // RUN:   -config="{CheckOptions: {StrictMode: true}}" \
-// RUN:   -- -isystem %clang_tidy_headers -fexceptions
+// RUN:   -- -isystem %clang_tidy_headers -fexceptions \
+// RUN:      -DPRI_CMDLINE_MACRO="\"%s\"" \
+// RUN:      -D__PRI_CMDLINE_MACRO="\"%s\""
 // RUN: %check_clang_tidy -check-suffixes=,NOTSTRICT \
 // RUN:   -std=c++23 %s modernize-use-std-print %t -- \
 // RUN:   -config="{CheckOptions: {StrictMode: false}}" \
-// RUN:   -- -isystem %clang_tidy_headers -fexceptions
+// RUN:   -- -isystem %clang_tidy_headers -fexceptions \
+// RUN:      -DPRI_CMDLINE_MACRO="\"%s\"" \
+// RUN:      -D__PRI_CMDLINE_MACRO="\"%s\""
 #include <cstddef>
 #include <cstdint>
 #include <cstdio>
@@ -1571,3 +1575,42 @@ void p(S s1, S *s2)
   // CHECK-MESSAGES: [[@LINE-1]]:3: warning: use 'std::print' instead of 'printf' [modernize-use-std-print]
   // CHECK-FIXES: std::print("Not std::string {} {}", s1.data(), s2->data());
 }
+
+// These need PRI and __PRI prefixes so that the check gets as far as looking
+// for where the macro comes from.
+#define PRI_FMT_MACRO "%s"
+#define __PRI_FMT_MACRO "%s"
+
+void macro_expansion(const char *s)
+{
+  const uint64_t u64 = 42;
+  const uint32_t u32 = 32;
+
+  printf("Replaceable macro at end %" PRIu64, u64);
+  // CHECK-MESSAGES: [[@LINE-1]]:3: warning: use 'std::print' instead of 'printf' [modernize-use-std-print]
+  // CHECK-FIXES: std::print("Replaceable macro at end {}", u64);
+
+  printf("Replaceable macros in middle %" PRIu64 " %" PRIu32 "...
[truncated]

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 6, 2024

@llvm/pr-subscribers-clang-tidy

Author: Mike Crowe (mikecrowe)

Changes

Expanding all macros in the printf/absl::StrFormat format string before conversion could easily break code if those macros are expended to change their definition between builds. It's important for this check to expand the <inttypes.h> PRI macros though, so let's ensure that the presence of any other macros in the format string causes the check to emit a warning and not perform any conversion.


Patch is 21.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97911.diff

11 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.cpp (+4-3)
  • (modified) clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.h (+1)
  • (modified) clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.cpp (+3-1)
  • (modified) clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.h (+1)
  • (modified) clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp (+47-5)
  • (modified) clang-tools-extra/clang-tidy/utils/FormatStringConverter.h (+5-1)
  • (modified) clang-tools-extra/docs/ReleaseNotes.rst (+2-1)
  • (modified) clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-print.rst (+13-9)
  • (modified) clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/inttypes.h (+16-10)
  • (modified) clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp (+44-9)
  • (modified) clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print.cpp (+45-2)
diff --git a/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.cpp b/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.cpp
index d082faa786b375..7e8cbd40fe07ae 100644
--- a/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.cpp
@@ -44,6 +44,7 @@ void UseStdFormatCheck::registerPPCallbacks(const SourceManager &SM,
                                             Preprocessor *PP,
                                             Preprocessor *ModuleExpanderPP) {
   IncludeInserter.registerPreprocessor(PP);
+  this->PP = PP;
 }
 
 void UseStdFormatCheck::registerMatchers(MatchFinder *Finder) {
@@ -78,9 +79,9 @@ void UseStdFormatCheck::check(const MatchFinder::MatchResult &Result) {
 
   utils::FormatStringConverter::Configuration ConverterConfig;
   ConverterConfig.StrictMode = StrictMode;
-  utils::FormatStringConverter Converter(Result.Context, StrFormat,
-                                         FormatArgOffset, ConverterConfig,
-                                         getLangOpts());
+  utils::FormatStringConverter Converter(
+      Result.Context, StrFormat, FormatArgOffset, ConverterConfig,
+      getLangOpts(), *Result.SourceManager, *PP);
   const Expr *StrFormatCall = StrFormat->getCallee();
   if (!Converter.canApply()) {
     diag(StrFormat->getBeginLoc(),
diff --git a/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.h b/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.h
index b59a4708c6e4bc..9ac2240212ebf6 100644
--- a/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.h
+++ b/clang-tools-extra/clang-tidy/modernize/UseStdFormatCheck.h
@@ -44,6 +44,7 @@ class UseStdFormatCheck : public ClangTidyCheck {
   StringRef ReplacementFormatFunction;
   utils::IncludeInserter IncludeInserter;
   std::optional<StringRef> MaybeHeaderToInclude;
+  Preprocessor *PP = nullptr;
 };
 
 } // namespace clang::tidy::modernize
diff --git a/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.cpp b/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.cpp
index 1ea170c3cd3106..69136c10d927b2 100644
--- a/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.cpp
@@ -68,6 +68,7 @@ void UseStdPrintCheck::registerPPCallbacks(const SourceManager &SM,
                                            Preprocessor *PP,
                                            Preprocessor *ModuleExpanderPP) {
   IncludeInserter.registerPreprocessor(PP);
+  this->PP = PP;
 }
 
 static clang::ast_matchers::StatementMatcher
@@ -137,7 +138,8 @@ void UseStdPrintCheck::check(const MatchFinder::MatchResult &Result) {
   ConverterConfig.StrictMode = StrictMode;
   ConverterConfig.AllowTrailingNewlineRemoval = true;
   utils::FormatStringConverter Converter(
-      Result.Context, Printf, FormatArgOffset, ConverterConfig, getLangOpts());
+      Result.Context, Printf, FormatArgOffset, ConverterConfig, getLangOpts(),
+      *Result.SourceManager, *PP);
   const Expr *PrintfCall = Printf->getCallee();
   const StringRef ReplacementFunction = Converter.usePrintNewlineFunction()
                                             ? ReplacementPrintlnFunction
diff --git a/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.h b/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.h
index 7a06cf38b4264f..995c740389e73b 100644
--- a/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.h
+++ b/clang-tools-extra/clang-tidy/modernize/UseStdPrintCheck.h
@@ -36,6 +36,7 @@ class UseStdPrintCheck : public ClangTidyCheck {
   }
 
 private:
+  Preprocessor *PP;
   bool StrictMode;
   std::vector<StringRef> PrintfLikeFunctions;
   std::vector<StringRef> FprintfLikeFunctions;
diff --git a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
index 33f3ea47df1e38..686c8fb71fae55 100644
--- a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
+++ b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
@@ -18,6 +18,7 @@
 #include "clang/ASTMatchers/ASTMatchFinder.h"
 #include "clang/Basic/LangOptions.h"
 #include "clang/Lex/Lexer.h"
+#include "clang/Lex/Preprocessor.h"
 #include "clang/Tooling/FixIt.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/Debug.h"
@@ -195,11 +196,10 @@ static bool castMismatchedIntegerTypes(const CallExpr *Call, bool StrictMode) {
   return false;
 }
 
-FormatStringConverter::FormatStringConverter(ASTContext *ContextIn,
-                                             const CallExpr *Call,
-                                             unsigned FormatArgOffset,
-                                             const Configuration ConfigIn,
-                                             const LangOptions &LO)
+FormatStringConverter::FormatStringConverter(
+    ASTContext *ContextIn, const CallExpr *Call, unsigned FormatArgOffset,
+    const Configuration ConfigIn, const LangOptions &LO, SourceManager &SM,
+    Preprocessor &PP)
     : Context(ContextIn), Config(ConfigIn),
       CastMismatchedIntegerTypes(
           castMismatchedIntegerTypes(Call, ConfigIn.StrictMode)),
@@ -208,11 +208,22 @@ FormatStringConverter::FormatStringConverter(ASTContext *ContextIn,
   assert(ArgsOffset <= NumArgs);
   FormatExpr = llvm::dyn_cast<StringLiteral>(
       Args[FormatArgOffset]->IgnoreImplicitAsWritten());
+
   if (!FormatExpr || !FormatExpr->isOrdinary()) {
     // Function must have a narrow string literal as its first argument.
     conversionNotPossible("first argument is not a narrow string literal");
     return;
   }
+
+  if (const auto MaybeMacroName =
+          formatStringContainsUnreplaceableMacro(FormatExpr, SM, PP);
+      MaybeMacroName) {
+    conversionNotPossible(
+        ("format string contains unreplaceable macro '" + *MaybeMacroName + "'")
+            .str());
+    return;
+  }
+
   PrintfFormatString = FormatExpr->getString();
 
   // Assume that the output will be approximately the same size as the input,
@@ -230,6 +241,37 @@ FormatStringConverter::FormatStringConverter(ASTContext *ContextIn,
   finalizeFormatText();
 }
 
+std::optional<StringRef>
+FormatStringConverter::formatStringContainsUnreplaceableMacro(
+    const StringLiteral *FormatExpr, SourceManager &SM, Preprocessor &PP) {
+  for (auto I = FormatExpr->tokloc_begin(), E = FormatExpr->tokloc_end();
+       I != E; ++I) {
+    const SourceLocation &TokenLoc = *I;
+    if (TokenLoc.isMacroID()) {
+      const StringRef MacroName =
+          Lexer::getImmediateMacroName(TokenLoc, SM, PP.getLangOpts());
+
+      // glibc uses __PRI64_PREFIX and __PRIPTR_PREFIX to define the prefixes
+      // for types that change size so we must look for multiple prefixes.
+      if (!MacroName.starts_with("PRI") && !MacroName.starts_with("__PRI"))
+        return MacroName;
+
+      const SourceLocation TokenSpellingLoc = SM.getSpellingLoc(TokenLoc);
+      const OptionalFileEntryRef MaybeFileEntry =
+          SM.getFileEntryRefForID(SM.getFileID(TokenSpellingLoc));
+      if (!MaybeFileEntry)
+        return MacroName;
+
+      HeaderSearch &HS = PP.getHeaderSearchInfo();
+      // Check if the file is a system header
+      if (!isSystem(HS.getFileDirFlavor(*MaybeFileEntry)) ||
+          llvm::sys::path::filename(MaybeFileEntry->getName()) != "inttypes.h")
+        return MacroName;
+    }
+  }
+  return std::nullopt;
+}
+
 void FormatStringConverter::emitAlignment(const PrintfSpecifier &FS,
                                           std::string &FormatSpec) {
   ConversionSpecifier::Kind ArgKind = FS.getConversionSpecifier().getKind();
diff --git a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.h b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.h
index 1109a0b602262f..5d4b694647aad8 100644
--- a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.h
+++ b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.h
@@ -40,7 +40,8 @@ class FormatStringConverter
 
   FormatStringConverter(ASTContext *Context, const CallExpr *Call,
                         unsigned FormatArgOffset, Configuration Config,
-                        const LangOptions &LO);
+                        const LangOptions &LO, SourceManager &SM,
+                        Preprocessor &PP);
 
   bool canApply() const { return ConversionNotPossibleReason.empty(); }
   const std::string &conversionNotPossibleReason() const {
@@ -110,6 +111,9 @@ class FormatStringConverter
 
   void appendFormatText(StringRef Text);
   void finalizeFormatText();
+  static std::optional<StringRef>
+  formatStringContainsUnreplaceableMacro(const StringLiteral *FormatExpr,
+                                         SourceManager &SM, Preprocessor &PP);
   bool conversionNotPossible(std::string Reason) {
     ConversionNotPossibleReason = std::move(Reason);
     return false;
diff --git a/clang-tools-extra/docs/ReleaseNotes.rst b/clang-tools-extra/docs/ReleaseNotes.rst
index e570c8184f8b0a..36f8edbcaa73ec 100644
--- a/clang-tools-extra/docs/ReleaseNotes.rst
+++ b/clang-tools-extra/docs/ReleaseNotes.rst
@@ -396,7 +396,8 @@ Changes in existing checks
 - Improved :doc:`modernize-use-std-print
   <clang-tidy/checks/modernize/use-std-print>` check to not crash if the
   format string parameter of the function to be replaced is not of the
-  expected type.
+  expected type. Only macros starting with ``PRI`` and ``__PRI`` from
+  ``<inttypes.h>`` are now expanded in the format string.
 
 - Improved :doc:`modernize-use-using <clang-tidy/checks/modernize/use-using>`
   check by adding support for detection of typedefs declared on function level.
diff --git a/clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-print.rst b/clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-print.rst
index 79648a1104bca2..1e33d347f65a02 100644
--- a/clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-print.rst
+++ b/clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-print.rst
@@ -24,8 +24,15 @@ into:
   std::println(stderr, "The {} is {:3}", description, value);
 
 If the `ReplacementPrintFunction` or `ReplacementPrintlnFunction` options
-are left, or assigned to their default values then this check is only
-enabled with `-std=c++23` or later.
+are left at or set to their default values then this check is only enabled
+with `-std=c++23` or later.
+
+Macros starting with ``PRI`` and ``__PRI`` from `<inttypes.h>` are
+expanded, escaping is handled and adjacent strings are concatenated to form
+a single ``StringLiteral`` before the format string is converted. Use of
+any other macros in the format string will cause a warning message to be
+emitted and no conversion will be performed. The resultant converted format
+string will always be a single string literal.
 
 The check doesn't do a bad job, but it's not perfect. In particular:
 
@@ -34,13 +41,10 @@ The check doesn't do a bad job, but it's not perfect. In particular:
   possible.
 
 - At the point that the check runs, the AST contains a single
-  ``StringLiteral`` for the format string and any macro expansion, token
-  pasting, adjacent string literal concatenation and escaping has been
-  handled. Although it's possible for the check to automatically put the
-  escapes back, they may not be exactly as they were written (e.g.
-  ``"\x0a"`` will become ``"\n"`` and ``"ab" "cd"`` will become
-  ``"abcd"``.) This is helpful since it means that the ``PRIx`` macros from
-  ``<inttypes.h>`` are removed correctly.
+  ``StringLiteral`` for the format string where escapes have been expanded.
+  The check tries to put the escapes back, they may not be exactly as they
+  were written (e.g. ``"\x41\x0a"`` will become ``"A\n"`` and ``"ab" "cd"``
+  will become ``"abcd"``.)
 
 - It supports field widths, precision, positional arguments, leading zeros,
   leading ``+``, alignment and alternative forms.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/inttypes.h b/clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/inttypes.h
index 9dc7ae39b3a3f0..74437f405931b2 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/inttypes.h
+++ b/clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/inttypes.h
@@ -21,40 +21,46 @@ typedef __UINT32_TYPE__ uint32_t;
 typedef __UINT16_TYPE__ uint16_t;
 typedef __UINT8_TYPE__ uint8_t;
 
-#define PRIdMAX "lld"
-#define PRId64 "lld"
+#if __WORDSIZE == 64
+# define __PRI64_PREFIX	"l"
+#else
+# define __PRI64_PREFIX	"ll"
+#endif
+
+#define PRIdMAX __PRI64_PREFIX "d"
+#define PRId64 __PRI64_PREFIX "d"
 #define PRId32 "d"
 #define PRId16 "hd"
 #define PRId8  "hhd"
 
-#define PRIiMAX "lli"
-#define PRIi64 "lli"
+#define PRIiMAX __PRI64_PREFIX "i"
+#define PRIi64 __PRI64_PREFIX "i"
 #define PRIi32 "i"
 #define PRIi16 "hi"
 #define PRIi8  "hhi"
 
-#define PRIiFAST64 "lli"
+#define PRIiFAST64 __PRI64_PREFIX "i"
 #define PRIiFAST32 "i"
 #define PRIiFAST16 "hi"
 #define PRIiFAST8  "hhi"
 
-#define PRIiLEAST64 "lli"
+#define PRIiLEAST64 __PRI64_PREFIX "i"
 #define PRIiLEAST32 "i"
 #define PRIiLEAST16 "hi"
 #define PRIiLEAST8  "hhi"
 
-#define PRIuMAX "llu"
-#define PRIu64 "llu"
+#define PRIuMAX __PRI64_PREFIX "u"
+#define PRIu64 __PRI64_PREFIX "u"
 #define PRIu32 "u"
 #define PRIu16 "hu"
 #define PRIu8  "hhu"
 
-#define PRIuFAST64 "llu"
+#define PRIuFAST64 __PRI64_PREFIX "u"
 #define PRIuFAST32 "u"
 #define PRIuFAST16 "hu"
 #define PRIuFAST8  "hhu"
 
-#define PRIuLEAST64 "llu"
+#define PRIuLEAST64 __PRI64_PREFIX "u"
 #define PRIuLEAST32 "u"
 #define PRIuLEAST16 "hu"
 #define PRIuLEAST8  "hhu"
diff --git a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp
index e8dea1dce2c972..09e67c7397f008 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp
+++ b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp
@@ -1,13 +1,18 @@
 // RUN: %check_clang_tidy \
 // RUN:   -std=c++20 %s modernize-use-std-format %t -- \
 // RUN:   -config="{CheckOptions: {StrictMode: true}}" \
-// RUN:   -- -isystem %clang_tidy_headers
+// RUN:   -- -isystem %clang_tidy_headers \
+// RUN:      -DPRI_CMDLINE_MACRO="\"%s\"" \
+// RUN:      -D__PRI_CMDLINE_MACRO="\"%s\""
 // RUN: %check_clang_tidy \
 // RUN:   -std=c++20 %s modernize-use-std-format %t -- \
 // RUN:   -config="{CheckOptions: {StrictMode: false}}" \
-// RUN:   -- -isystem %clang_tidy_headers
+// RUN:   -- -isystem %clang_tidy_headers \
+// RUN:      -DPRI_CMDLINE_MACRO="\"%s\"" \
+// RUN:      -D__PRI_CMDLINE_MACRO="\"%s\""
 #include <string>
 // CHECK-FIXES: #include <format>
+#include <inttypes.h>
 
 namespace absl
 {
@@ -103,13 +108,6 @@ std::string StrFormat_macros() {
   // CHECK-MESSAGES: [[@LINE-1]]:13: warning: use 'std::format' instead of 'StrFormat' [modernize-use-std-format]
   // CHECK-FIXES: std::format("Hello {}", 42);
 
-  // The format string is replaced even though it comes from a macro, this
-  // behaviour is required so that that <inttypes.h> macros are replaced.
-#define FORMAT_STRING "Hello %s"
-  auto s2 = absl::StrFormat(FORMAT_STRING, 42);
-  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: use 'std::format' instead of 'StrFormat' [modernize-use-std-format]
-  // CHECK-FIXES: std::format("Hello {}", 42);
-
   // Arguments that are macros aren't replaced with their value, even if they are rearranged.
 #define VALUE 3.14159265358979323846
 #define WIDTH 10
@@ -117,4 +115,41 @@ std::string StrFormat_macros() {
   auto s3 = absl::StrFormat("Hello %*.*f", WIDTH, PRECISION, VALUE);
   // CHECK-MESSAGES: [[@LINE-1]]:13: warning: use 'std::format' instead of 'StrFormat' [modernize-use-std-format]
   // CHECK-FIXES: std::format("Hello {:{}.{}f}", VALUE, WIDTH, PRECISION);
+
+  const uint64_t u64 = 42;
+  const uint32_t u32 = 32;
+  std::string s;
+
+  auto s4 = absl::StrFormat("Replaceable macro at end %" PRIu64, u64);
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: use 'std::format' instead of 'StrFormat' [modernize-use-std-format]
+  // CHECK-FIXES: std::format("Replaceable macro at end {}", u64);
+
+  auto s5 = absl::StrFormat("Replaceable macros in middle %" PRIu64 " %" PRIu32 "\n", u64, u32);
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: use 'std::format' instead of 'StrFormat' [modernize-use-std-format]
+  // CHECK-FIXES: std::format("Replaceable macros in middle {} {}\n", u64, u32);
+
+// These need PRI and __PRI prefixes so that the check get as far as looking for
+// where the macro comes from.
+#define PRI_FMT_MACRO "%s"
+#define __PRI_FMT_MACRO "%s"
+
+  auto s6 = absl::StrFormat("Unreplaceable macro at end " PRI_FMT_MACRO, s.c_str());
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro 'PRI_FMT_MACRO' [modernize-use-std-format]
+
+  auto s7 = absl::StrFormat(__PRI_FMT_MACRO " Unreplaceable macro at beginning", s);
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro '__PRI_FMT_MACRO' [modernize-use-std-format]
+
+  auto s8 = absl::StrFormat("Unreplacemable macro " PRI_FMT_MACRO " in the middle", s);
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro 'PRI_FMT_MACRO' [modernize-use-std-format]
+
+  auto s9 = absl::StrFormat("First macro is replaceable %" PRIu64 " but second one is not " __PRI_FMT_MACRO, u64, s);
+  // CHECK-MESSAGES: [[@LINE-1]]:13: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro '__PRI_FMT_MACRO' [modernize-use-std-format]
+
+  // Needs a PRI prefix so that we get as far as looking for where the macro comes from
+  auto s10 = absl::StrFormat(" macro from command line " PRI_CMDLINE_MACRO, s);
+  // CHECK-MESSAGES: [[@LINE-1]]:14: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro 'PRI_CMDLINE_MACRO' [modernize-use-std-format]
+
+  // Needs a __PRI prefix so that we get as far as looking for where the macro comes from
+  auto s11 = absl::StrFormat(" macro from command line " __PRI_CMDLINE_MACRO, s);
+  // CHECK-MESSAGES: [[@LINE-1]]:14: warning: unable to use 'std::format' instead of 'StrFormat' because format string contains unreplaceable macro '__PRI_CMDLINE_MACRO' [modernize-use-std-format]
 }
diff --git a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print.cpp b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print.cpp
index da1a18782c9bed..fc953470209e2d 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print.cpp
+++ b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print.cpp
@@ -1,11 +1,15 @@
 // RUN: %check_clang_tidy -check-suffixes=,STRICT \
 // RUN:   -std=c++23 %s modernize-use-std-print %t -- \
 // RUN:   -config="{CheckOptions: {StrictMode: true}}" \
-// RUN:   -- -isystem %clang_tidy_headers -fexceptions
+// RUN:   -- -isystem %clang_tidy_headers -fexceptions \
+// RUN:      -DPRI_CMDLINE_MACRO="\"%s\"" \
+// RUN:      -D__PRI_CMDLINE_MACRO="\"%s\""
 // RUN: %check_clang_tidy -check-suffixes=,NOTSTRICT \
 // RUN:   -std=c++23 %s modernize-use-std-print %t -- \
 // RUN:   -config="{CheckOptions: {StrictMode: false}}" \
-// RUN:   -- -isystem %clang_tidy_headers -fexceptions
+// RUN:   -- -isystem %clang_tidy_headers -fexceptions \
+// RUN:      -DPRI_CMDLINE_MACRO="\"%s\"" \
+// RUN:      -D__PRI_CMDLINE_MACRO="\"%s\""
 #include <cstddef>
 #include <cstdint>
 #include <cstdio>
@@ -1571,3 +1575,42 @@ void p(S s1, S *s2)
   // CHECK-MESSAGES: [[@LINE-1]]:3: warning: use 'std::print' instead of 'printf' [modernize-use-std-print]
   // CHECK-FIXES: std::print("Not std::string {} {}", s1.data(), s2->data());
 }
+
+// These need PRI and __PRI prefixes so that the check gets as far as looking
+// for where the macro comes from.
+#define PRI_FMT_MACRO "%s"
+#define __PRI_FMT_MACRO "%s"
+
+void macro_expansion(const char *s)
+{
+  const uint64_t u64 = 42;
+  const uint32_t u32 = 32;
+
+  printf("Replaceable macro at end %" PRIu64, u64);
+  // CHECK-MESSAGES: [[@LINE-1]]:3: warning: use 'std::print' instead of 'printf' [modernize-use-std-print]
+  // CHECK-FIXES: std::print("Replaceable macro at end {}", u64);
+
+  printf("Replaceable macros in middle %...
[truncated]

@mikecrowe
Copy link
Contributor Author

I'm not particularly happy with the FormatStringConverter constructor now taking seven parameters. The risk of them getting muddled isn't huge though because they are all of different types. Should I perhaps put them all into the FormatStringConverter::Configuration struct or perhaps a separate struct in the absence of named parameters? Even if I leave them as seven parameters, is there a preferred order for them? Any recommendations gratefully received.

I considered adding a configuration option to list other macros that should be permitted in the format string. I will investigate doing so if this change is acceptable.

OpenBSD and mingw32 seem to just use PRI prefixes in their inttypes.h. I don't know whether Microsoft and MacOS do anything like glibc does which required the special __PRI handling.

@mikecrowe
Copy link
Contributor Author

It looks like the tests are failing on Windows because the %s in the command line

// RUN: %check_clang_tidy \
// RUN:   -std=c++20 %s modernize-use-std-format %t -- \
// RUN:   -config="{CheckOptions: {StrictMode: true}}" \
// RUN:   -- -isystem %clang_tidy_headers \
// RUN:      -DPRI_CMDLINE_MACRO="\"%s\"" \
// RUN:      -D__PRI_CMDLINE_MACRO="\"%s\""

is being replaced by the filename for the LIT test:

| Running ['clang-tidy', 'C:\\ws\\src\\build\\tools\\clang\\tools\\extra\\test\\clang-tidy\\checkers\\modernize\\Output\\use-std-format.cpp.tmp.cpp', '-fix', '--checks=-*,modernize-use-std-format', '-config={CheckOptions: {StrictMode: true}}', '--', '-isystem', 'C:/ws/src/clang-tools-extra/test/../test\\clang-tidy\\checkers\\Inputs\\Headers', '-DPRI_CMDLINE_MACRO="C:\\ws\\src\\clang-tools-extra\\test\\clang-tidy\\checkers\\modernize\\use-std-format.cpp"', '-D__PRI_CMDLINE_MACRO="C:\\ws\\src\\clang-tools-extra\\test\\clang-tidy\\checkers\\modernize\\use-std-format.cpp"', '-std=c++20', '-nostdinc++']...

I can believe that this is being done by TestRunner.py (or some other part of llvm-lit) since that's expected for the first %s on the command line, but the surprise is that it doesn't also do this on Linux. Nevertheless, I'll modify the tests to avoid using % at all in the macros on the command line.

Copy link
Member

@PiotrZSL PiotrZSL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks fine for me.

if (!MacroName.starts_with("PRI") && !MacroName.starts_with("__PRI"))
return MacroName;

const SourceLocation TokenSpellingLoc = SM.getSpellingLoc(TokenLoc);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add checking here if location is valid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that I should check whether the SourceLocation returned from SourceManager::getSpellingLoc() is valid?

Looking at the SourceLocation documentation and other checks that call SourceManager::getSpellingLoc() I was unable to spot how to tell whether the location returned is valid. All the other callers I looked at just went on to use the returned location without any checking.

Can I arrange for an invalid location to be returned from the lit test somehow to test this case?

@mikecrowe
Copy link
Contributor Author

Unfortunately, walking the format string looking for macros also seems to match a macro that encloses the whole function invocation. This results in the check failing to work on expressions inside Catch2 REQUIRE macros for example.

My new test case (that currently fails) is:

#define SURROUND(x) x
  SURROUND(printf("Surrounding macro %d\n", 42));
  // CHECK-MESSAGES: [[@LINE-1]]:3: warning: use 'std::println' instead of 'printf' [modernize-use-std-print]
  // CHECK-FIXES: std::print("Surrounding macro {}", 42);

I'll try to fix this, but any advice is gratefully received.

@mikecrowe
Copy link
Contributor Author

Unfortunately, walking the format string looking for macros also seems to match a macro that encloses the whole function invocation. This results in the check failing to work on expressions inside Catch2 REQUIRE macros for example.

My new test case (that currently fails) is:

#define SURROUND(x) x
  SURROUND(printf("Surrounding macro %d\n", 42));
  // CHECK-MESSAGES: [[@LINE-1]]:3: warning: use 'std::println' instead of 'printf' [modernize-use-std-print]
  // CHECK-FIXES: std::print("Surrounding macro {}", 42);

I'll try to fix this, but any advice is gratefully received.

I've pushed a new commit that works for the Lit test cases that I've added by getting the macro name for the function call and permitting that macro for the format string too. I can't say that I'm particularly happy with it, and there may be corner cases such as the same macro being intentionally used in both places, but it's the best I can currently work out how to do.

@@ -230,6 +241,50 @@ FormatStringConverter::FormatStringConverter(ASTContext *ContextIn,
finalizeFormatText();
}

std::optional<StringRef>
FormatStringConverter::formatStringContainsUnreplaceableMacro(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getting the macro name for the function call and permitting that macro for the format string too

such as the same macro being intentionally used in both places

It should be possible to detect this case comparing the FileID of the call's location and the format expression's location. If they differ, then they are different arguments -> different macros. Although, I haven't checked that yet.

@mikecrowe
Copy link
Contributor Author

@5chmidti wrote:

It should be possible to detect this case comparing the FileID of the call's location and the format expression's location. If they differ, then they are different arguments -> different macros. Although, I haven't checked that yet.

I had thought so too but I was unable to find any SourceLocation or Lexer functions that returned what I needed. I've pushed an extra commit to the branch for this PR that contains an extra LIT test and some debug output. The test contains:

SURROUND_ALL(printf(SURROUND_FORMAT_PARTIAL("%" FMT) "\n", "42"));

with the hope that I can detect that SURROUND_ALL is applied to both the printf and the "\n" but different macros are applied to the other tokens that make up the format string.

The debug output (with the filenames shortened for readability) is:

getImmediateExpansionRange for 'BeginCallLoc': <use-std-print-macro.cpp.tmp.cpp:16:3 <Spelling=line:10:25>>
getExpansionRange for 'BeginCallLoc': <use-std-print-macro.cpp.tmp.cpp:16:3, col:67>
getTopMacroCaller for 'BeginCallLoc': use-std-print-macro.cpp.tmp.cpp:16:16
getImmediateMacroCallerLoc for 'BeginCallLoc': use-std-print-macro.cpp.tmp.cpp:16:16
getImmediateSpellingLoc for 'BeginCallLoc': use-std-print-macro.cpp.tmp.cpp:16:16
getFileLoc for 'BeginCallLoc': use-std-print-macro.cpp.tmp.cpp:16:16

getImmediateExpansionRange for 'SURROUND_FORMAT_PARTIAL': <use-std-print-macro.cpp.tmp.cpp:16:3 <Spelling=line:10:25>>
getExpansionRange for 'SURROUND_FORMAT_PARTIAL': <use-std-print-macro.cpp.tmp.cpp:16:3, col:67>
getTopMacroCaller for 'SURROUND_FORMAT_PARTIAL': use-std-print-macro.cpp.tmp.cpp:16:47
getImmediateMacroCallerLoc for 'SURROUND_FORMAT_PARTIAL': use-std-print-macro.cpp.tmp.cpp:16:23 <Spelling=use-std-print-macro.cpp.tmp.cpp:16:47>
getImmediateSpellingLoc for 'SURROUND_FORMAT_PARTIAL': use-std-print-macro.cpp.tmp.cpp:16:23 <Spelling=use-std-print-macro.cpp.tmp.cpp:16:47>
getFileLoc for 'SURROUND_FORMAT_PARTIAL': use-std-print-macro.cpp.tmp.cpp:16:47

getImmediateExpansionRange for 'FMT': <use-std-print-macro.cpp.tmp.cpp:16:3 <Spelling=line:10:25>>
getExpansionRange for 'FMT': <use-std-print-macro.cpp.tmp.cpp:16:3, col:67>
getTopMacroCaller for 'FMT': use-std-print-macro.cpp.tmp.cpp:16:51 <Spelling=use-std-print-macro.cpp.tmp.cpp:12:13>
getImmediateMacroCallerLoc for 'FMT': use-std-print-macro.cpp.tmp.cpp:16:23 <Spelling=use-std-print-macro.cpp.tmp.cpp:12:13>
getImmediateSpellingLoc for 'FMT': use-std-print-macro.cpp.tmp.cpp:16:23 <Spelling=use-std-print-macro.cpp.tmp.cpp:12:13>
getFileLoc for 'FMT': use-std-print-macro.cpp.tmp.cpp:16:51

getImmediateExpansionRange for 'SURROUND_ALL': <use-std-print-macro.cpp.tmp.cpp:16:3 <Spelling=line:10:25>>
getExpansionRange for 'SURROUND_ALL': <use-std-print-macro.cpp.tmp.cpp:16:3, col:67>
getTopMacroCaller for 'SURROUND_ALL': use-std-print-macro.cpp.tmp.cpp:16:56
getImmediateMacroCallerLoc for 'SURROUND_ALL': use-std-print-macro.cpp.tmp.cpp:16:56
getImmediateSpellingLoc for 'SURROUND_ALL': use-std-print-macro.cpp.tmp.cpp:16:56
getFileLoc for 'SURROUND_ALL': use-std-print-macro.cpp.tmp.cpp:16:56

Although the last token inside SURROUND_ALL has a few functions which return the same location as for BeginCallLoc, these functions also return that location for the tokens inside other macros too. :(

The change also captures the FileID for BeginCallLoc and compares it against each TokenLoc but never gets a match.

Or perhaps I've misinterpreted what you meant. If so, please can you provide more details?

Copy link
Contributor

@5chmidti 5chmidti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be 1 week+ before I am back at my main machine to take a better look at the macro question. Maybe this can just be landed, without that distinction.

@mikecrowe mikecrowe force-pushed the mac/fix-use-std-print-macro-expansion branch from 3028e49 to 9b514e7 Compare July 27, 2024 15:51
@mikecrowe
Copy link
Contributor Author

mikecrowe commented Jul 27, 2024

@5chmidti wrote:

It'll be 1 week+ before I am back at my main machine to take a better look at the macro question. Maybe this can just be landed, without that distinction.

Thanks for letting me know and thanks for the review.

I've pushed a new version to this PR with the typo you spotted fixed and removing the commit containing the debug output for exploring the macro detection (the version with the extra debug is at https://github.com/mikecrowe/clang-tidy-fmt/commits/mac/fix-use-std-print-macro-expansion-investigation/ if you still need it).

I'll let you and @PiotrZSL decide whether you want to land this PR as it is since I don't have permission to do so myself. I've deliberately not been rebasing this PR on top of origin/main to make review easier. I notice that it now has conflicts on the release notes. If you are happy with this change as it is then I can squash and rebase and fix the conflict to make it easier to land if you want me to.

Copy link
Contributor

@5chmidti 5chmidti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from my side

@mikecrowe mikecrowe force-pushed the mac/fix-use-std-print-macro-expansion branch from 9b514e7 to f7be65b Compare July 28, 2024 15:56
@mikecrowe
Copy link
Contributor Author

@5chmidti wrote:

Looks good from my side

Thanks. I've squashed and rebased. Since 19 has been branched now and it includes modernize-use-std-format I've mentioned the improvement as being for both modernize-use-std-print and modernize-use-std-format in the release notes whilst I was resolving the conflict.

@mikecrowe mikecrowe force-pushed the mac/fix-use-std-print-macro-expansion branch from f7be65b to 71dedef Compare July 28, 2024 16:09
…mat/print

Expanding all macros in the printf/absl::StrFormat format string before
conversion could easily break code if those macros are expended to
change their definition between builds. It's important for this check to
expand the <inttypes.h> PRI macros though, so let's ensure that the
presence of any other macros in the format string causes the check to
emit a warning and not perform any conversion.
@mikecrowe mikecrowe force-pushed the mac/fix-use-std-print-macro-expansion branch from 71dedef to 2dd0902 Compare October 5, 2024 18:03
@mikecrowe
Copy link
Contributor Author

I've fixed the release notes conflicts and slightly tweaked the documentation a little. @5chmidti or @PiotrZSL, please land this if you are happy that it's a net improvement on the existing code. I can then try to work out how to improve it further. Thanks.

@5chmidti
Copy link
Contributor

5chmidti commented Oct 8, 2024

Yep, thanks. Sorry for the delay

@5chmidti 5chmidti merged commit a199fb1 into llvm:main Oct 8, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants