[clang] Further improvements to no_preserve_cheri_tags analysis #652

arichardson · 2022-10-06T22:11:42Z

This adds the no_preserve_cheri_tags attribute in more cases. The analysis could still be improved but I think there is probably not much to gain from that.

The one somewhat important case that was not handled in the previous PR is that we now also look through member expressions.

DebugLocEntry assumes that it contains either one item without a fragment or many items that all have fragments (see the assert in addValues).

When EXPENSIVE_CHECKS is enabled, _GLIBCXX_DEBUG is defined. On a few machines I've checked, this causes std::sort to call the comparator even if there is only one item to sort, perhaps to check that the ordering is implemented properly; I didn't find out exactly why. If this happens, operator< for a DbgValueLoc crashes because the optional Fragment is empty.

Whether this happens seems to depend on the compiler, linker, and optimisation level: I've seen it on x86 Ubuntu, but the release EXPENSIVE_CHECKS buildbot did not have this issue.

Add an explicit check for whether we have only one item.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D130156

(cherry picked from commit a0ccba5)
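The single-item guard described in this commit message can be sketched roughly as follows. This is not the LLVM code: `Loc`, `FragmentOffset`, and `sortByFragment` are hypothetical stand-ins, and the comparator only mimics the property that it is invalid when the optional fragment is empty.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for DbgValueLoc: only the optional fragment matters.
struct Loc {
  std::optional<unsigned> FragmentOffset;
};

// Only valid when both sides carry a fragment, like the comparator
// described in the commit message.
inline bool operator<(const Loc &A, const Loc &B) {
  assert(A.FragmentOffset && B.FragmentOffset &&
         "comparing entries without fragments");
  return *A.FragmentOffset < *B.FragmentOffset;
}

// Sort by fragment offset, but never invoke the comparator for a single
// entry, which is the one case where the fragment is allowed to be empty.
inline void sortByFragment(std::vector<Loc> &Values) {
  if (Values.size() == 1)
    return;
  std::sort(Values.begin(), Values.end());
}
```

With the early return, a lone entry without a fragment is left untouched regardless of whether the standard library decides to call the comparator on it.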

These tests highlight some places where we can easily add the no_preserve_tags attribute to allow inlining small copies.

This allows inlining of structure assignments for structs that are at least capability size but do not contain any capabilities (e.g. `struct { long a; long b; }`). We can also set the attribute for all trivial auto var-init cases since those patterns never contain valid capabilities.

Due to C's effective type rules, we have to be careful when setting the attribute and only perform the type-based tag-preservation analysis if we know the effective type. For example, marking a memcpy() to/from `long*` as not tag-preserving could result in tag stripping for code that uses type casts. Such code is correct even under strict aliasing rules since the first store to a memory location determines the type.

Example from #506:

```
void *malloc(__SIZE_TYPE__);
void *memcpy(void *, const void *, __SIZE_TYPE__);

void foo(long **p, long **q) {
  *p = malloc(32);
  *q = malloc(32);
  (*p)[0] = 1;
  (*p)[1] = 2;
  *(void (**)(long **, long **))(*p + 2) = &foo;
  memcpy(*q, *p, 32);
}
```

Despite the memcpy() argument being a long* (and therefore intuitively not tag-preserving), we can't add the attribute since we don't actually know the type of the underlying object (malloc creates an allocated object with no declared type). From C99:

```
The effective type of an object for an access to its stored value is the declared type of the object, if any (footnote 75: Allocated objects have no declared type). If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.
For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
```

There is another important caveat: if we don't know the copy size, we have to conservatively assume that the copy affects adjacent data (e.g. C++ subclass fields) that could hold capabilities. If the copy size is <= sizeof(T), we can mark copies as non-tag-preserving since the copy cannot affect trailing fields (even if we are actually copying a subclass).

We are also conservative if the structure contains an array of type ((un)signed) char or std::byte since those are often used to store arbitrary data (including capabilities). We could make this check stricter and require the array to be capability aligned, but that could be done as a follow-up change.
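The copy-size caveat can be illustrated with a small sketch. The type and function names (`Base`, `Derived`, `copyBaseOnly`, `copyUnknownSize`) are hypothetical, and the comments restate the reasoning from the commit message rather than describing verified compiler output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Base { long a; long b; };    // no pointer members
struct Derived : Base { void *p; }; // trailing field that may hold a capability

// The copy size is exactly sizeof(Base), so the copy cannot reach Derived::p
// even when both pointers really refer to Derived objects. Per the
// description above, this is the kind of copy that can be marked as
// non-tag-preserving.
inline void copyBaseOnly(Base *dst, const Base *src) {
  std::memcpy(dst, src, sizeof(Base));
}

// The copy size is only known at run time, so the copy may extend into
// trailing subclass fields. Per the description above, the analysis has to
// stay conservative here.
inline void copyUnknownSize(Base *dst, const Base *src, std::size_t n) {
  std::memcpy(dst, src, n);
}
```

Calling `copyUnknownSize(&d, &s, sizeof(Derived))` on two `Derived` objects also copies `p`, which is why an unknown size cannot be treated like the bounded case.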

If we can see a non-pointer VarDecl, we know that the effective type that is being copied to/from matches the type of the VarDecl. Previously, the following code did not get the no_preserve_tags attribute, but now it does: `int buf[16]; __builtin_memmove(cap, buf, sizeof(*cap));`

There are a few more cases related to member expressions where we could add the attribute but don't yet. For example, for &foo->member if we can see the definition of foo and the entire struct does not contain capabilities. See the no-tag-copy-member-expr.cpp test for more examples and rationale.

# Conflicts:
#	clang/lib/CodeGen/CodeGenTypes.cpp
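A self-contained version of the VarDecl pattern might look like this. The function name `fillFromLocal` and the `void **` type chosen for `cap` are assumptions made for illustration; the original snippet does not say what `cap` is.

```cpp
#include <cassert>
#include <cstring>

// `cap` points at pointer-sized storage (an assumption; see the lead-in).
inline void fillFromLocal(void **cap) {
  // `buf` is a visible, non-pointer variable declaration, so its declared
  // type (int[16]) is also the effective type of the copy source. An int
  // array cannot hold capabilities, which is the reasoning given above for
  // setting the attribute on this copy.
  int buf[16] = {0};
  std::memmove(cap, buf, sizeof(*cap));
}
```

The copy overwrites the pointer's storage with the leading bytes of `buf`, so after the call `*cap` holds only zero bytes.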

If we have an expression such as `__builtin_memcpy(buf, &s.not_a_cap, len);`, we can see the declaration of s, the entire VarDecl for s does not contain tags, and len does not extend beyond the end of s, then we can set no_preserve_tags.
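As a rough sketch of the member-expression case, with a hypothetical struct `S` and helper `copyFromMember` (only `s` and `not_a_cap` come from the text above):

```cpp
#include <cassert>
#include <cstring>

// No member of S can hold a capability.
struct S {
  long not_a_cap;
  long more[3];
};

inline void copyFromMember(void *buf) {
  S s{1, {2, 3, 4}};
  // The source is a member expression, but the declaration of `s` is
  // visible, the whole variable contains no capabilities, and the length
  // (sizeof(s), starting at offset 0) does not extend past the end of `s`.
  // These are the conditions listed above for setting no_preserve_tags.
  std::memcpy(buf, &s.not_a_cap, sizeof(s));
}
```

The destination is deliberately an untyped `void *` so that only the source side of the copy provides type information.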

If we see something like `extern struct NoCaps unsized_array[];`, we can still assume that we know the effective type since global variables have a defined type. The only exceptions are arrays of type (unsigned) char since the C standard allows those to alias any other type.
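The two global-array cases could be written as below. The layout of `NoCaps`, the `raw_bytes` array, and both helper functions are hypothetical; the definitions at the end exist only to keep the sketch linkable as a single file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct NoCaps { long a; long b; };

// The array bound is unknown at this point, but the declared element type
// is not, so the effective type of each element is known.
extern NoCaps unsized_array[];

inline void copyElement(void *dst, std::size_t i) {
  std::memcpy(dst, &unsized_array[i], sizeof(NoCaps));
}

// Character arrays may alias any other type, so a copy from this array
// could carry arbitrary data. This is the exception described above.
extern unsigned char raw_bytes[];

inline void copyRaw(void *dst) {
  std::memcpy(dst, raw_bytes, sizeof(void *));
}

// Definitions, normally found in another translation unit.
NoCaps unsized_array[2] = {{1, 2}, {3, 4}};
unsigned char raw_bytes[sizeof(void *)] = {};
```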

Looking at the commit history, the only test that triggers this is specific to the MS C++ ABI, so I have not been able to create a testcase.

This is unlikely to have an effect on the resulting codegen since most of those memcpy's are so small that they will be optimized away before hitting the backend, but I found this while making the PreserveTags argument mandatory for all calls to CreateMemCpy().

This ensures that we don't accidentally regress code generation when merging from upstream.

arichardson requested a review from jrtc27 October 6, 2022 22:11

DavidSpickett and others added 13 commits October 7, 2022 10:36

Add tests for clang setting the no_preserve_cheri_tags attribute

f376284

These tests highlight some places where we can easily add the no_preserve_tags attribute to allow inlining small copies.

Memory copies of less than cap size do not need to preserve tags

26fe50a

Memory copies from string literals do not need to preserve tags

aa535e2

[CGAtomic] Set PreserveCheriTags::Unnecessary for integer memcpy

e84966c

[CHERI] No need to preserve tags when copying from NULL constants

b014086

Looking at the commit history, the only test that triggers this is specific to the MS C++ ABI, so I have not been able to create a testcase.

[CHERI] Baseline test for tag preservation on coerced arguments

f3d870d

[CHERI][clang] Require the preserve tags argument

f72039e

This ensures that we don't accidentally regress code generation when merging from upstream.

arichardson force-pushed the no-preserve-tags-clang-extended branch from e2ff40c to f72039e October 7, 2022 10:37
