Reading Pointer bytes as Integers #547
Comments
Which pointers?
I failed to clarify that. It was referring to the const-eval AM, where allocations that exist outside of the particular constant evaluation (what I call symbolic pointers) can't be assigned an address.
Const-eval can't assign an address to any allocation, "inside" or "outside". (Not sure what you mean with that distinction.)
This comment has been minimized.
Anyway that sub-discussion seems off-topic here, please move it to Zulip. And please update the issue description to clarify that "certain pointers" refers to const-eval.
I suppose the third alternative that should be addressed is that the read exposes the pointer bytes, but I don't like that suggestion (and I recall few people did), as it means that reads can result in a side effect, and such reads as an integer type can never be elided. Is there any other alternative I'm missing?
Yeah I definitely don't like that suggestion, it pessimizes optimization too much. It is worth mentioning that that third alternative is basically what PNVI-ae-udi mandates for C. I am curious if compilers will actually implement that, though.
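For contrast, here is a minimal sketch of the one operation that already exposes provenance in Rust today, the `as usize` cast. The third alternative above would, in effect, give every integer-typed read of pointer bytes this same side effect. The function name is hypothetical and chosen only for illustration.

```rust
/// Round-trips a pointer through an integer using `as` casts.
/// (Hypothetical helper, for illustration only.)
fn roundtrip_via_exposing_cast(p: *const i32) -> *const i32 {
    // `p as usize` is a pointer-to-integer cast: besides producing the
    // address, it marks the provenance of `p` as exposed.
    let addr = p as usize;
    // The integer-to-pointer cast may pick up that previously exposed
    // provenance, so the result can be dereferenced by the caller.
    addr as *const i32
}
```

Because the cast has a side effect on the abstract machine state, it cannot be removed even when `addr` is otherwise unused. That is the cost the comments above object to paying on every plain read.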
We could characterize this as an "unsupported in const-eval" error rather than a UB error. (Internally in rustc this is already what we do, That would be similar to how
There's another aspect of provenance that we haven't officially decided yet and that is implicitly excluded by the current wording in rust-lang/reference#1664: do the individual bytes in a pointer "remember" where in the pointer they are, and do they have to be put back in the same order? Some formal models require this, and if we ever allow "taking apart" the bytes of a pointer in const-eval we'll also have to require this, but for runtime semantics we could decide either way. The one example of code that I am aware of that breaks this requirement is XOR linked lists, which can be implemented in the semantics sketched in MiniRust right now but can't be implemented if bytes with provenance remember their position in the pointer. That's not exactly realistic code, but it is somewhat satisfying that (on architectures where pointers have at least 2 bytes) XOR linked lists can be implemented. The main upside of requiring the same bytes in the same order is that it rules out pointer crimes like XOR linked lists, so if there are some unexpected interactions there, we'd not be affected. I am not aware of an optimization that would benefit from this UB; it's mostly a case of "ruling out some rather cursed programs to avoid locking ourselves into an unexpected corner". In some sense the model becomes a bit simpler, since we can just say that pointer bytes must be put back together in the same order they started out in before they can be treated as a pointer again, but the actual op.sem would become more complicated because of the extra bookkeeping required to enforce this.
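The uncontroversial case, taking a pointer apart byte by byte and putting the bytes back in the same order, can be sketched as follows. This assumes that untyped one-byte copies via `ptr::copy_nonoverlapping` carry each byte's provenance fragment along, as in the MiniRust semantics mentioned above. The function name is hypothetical.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr;

/// Copies a pointer one byte at a time, keeping the bytes in order.
/// (Hypothetical helper, for illustration only.)
fn copy_pointer_bytewise(src: *const i32) -> *const i32 {
    let mut dst = MaybeUninit::<*const i32>::uninit();
    let src_bytes = (&src as *const *const i32).cast::<u8>();
    let dst_bytes = dst.as_mut_ptr().cast::<u8>();
    for i in 0..size_of::<*const i32>() {
        // SAFETY: both regions are valid for `size_of::<*const i32>()`
        // bytes and do not overlap. Each copy moves exactly one byte.
        unsafe { ptr::copy_nonoverlapping(src_bytes.add(i), dst_bytes.add(i), 1) };
    }
    // SAFETY: every byte of `dst` was written by the loop above.
    unsafe { dst.assume_init() }
}
```

Under either answer to the question above, the result should be usable as a pointer, since byte `i` of the source ends up as byte `i` of the destination. A variant that permuted or mixed the bytes (as an XOR linked list does) is where the two candidate models would disagree.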
This came up in rust-lang/reference#1664. I wanted to ask what T-opsem thinks about the behaviour of reading pointer bytes as integer types (or as char/bool/etc.).
As far as I can tell, there are two "sensible" behaviours, given that integers themselves do not carry provenance:
Given provenance monotonicity, which would be violated by the decoding error, it seems like the best option is that the fragments are ignored. Is anything missing here? If not, can we get a formal sign-off on this behaviour?
Note that I'm only considering the runtime behaviour, which can be a point against adopting the behaviour. Given that it's impossible to get the address of certain pointers in const-eval, it does need to be undefined behaviour (or otherwise an error) to read pointer bytes (to at least symbolic allocations) as integer types.
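As a concrete runtime illustration of the operation under discussion, here is a minimal sketch of a typed load that reads the bytes of a pointer at type `usize`. The function name is hypothetical. The comments assume the "fragments are ignored" behaviour favoured above and are not a statement of settled semantics.

```rust
/// Reads the bytes of a thin pointer through a `usize`-typed load.
/// (Hypothetical helper, for illustration only.)
fn pointer_bytes_as_usize(p: *const i32) -> usize {
    // `slot` points at the memory that stores the pointer value `p`.
    let slot: *const *const i32 = &p;
    // SAFETY: `slot` is valid and initialized, and `usize` has the same
    // size and alignment as a thin raw pointer.
    // Under the "ignore the fragments" option, this load yields the
    // address and drops the provenance, without exposing it.
    unsafe { slot.cast::<usize>().read() }
}
```

At runtime the returned integer equals the pointer's address. The same load during constant evaluation is the case the note above says cannot produce an address.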