-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Language vs. implementation threat models and implications for TypeId collision resistance #129030
Comments
In #129016, we ask the concrete question of what the soundness expectations are for incremental compilation. It sounds like this issue tries to do the same but for at least half a dozen complicated discussions all at once. I find this much easier to discuss in concrete cases than in the super-abstract. Security is a trade-off, and specifically for TypeId if there are no noticeable downsides to achieving better collision resistance I see no reason not to do it. For this issue here, I am not entirely sure what the question is. "T-lang, please define a threat model" is too open-ended to be a useful question to nominate -- when exploring wide-open design decisions, the usual process would be to schedule a t-lang design meeting, and the meeting document should sketch out the design space with at least one concrete proposal for how to answer the question. Given that you are asking for some very over-arching decision that includes everything from cargo implementation issues to compiler bugs, the document should spell out how a concrete threat model answers all these questions. Speaking of cargo, some of these decisions aren't even for t-lang to make, so this sounds like it would need a lot of cross-team coordination. Personally I don't think we gain much from tying together the discussion around hash collisions in the compiler with build script sandboxing. That just makes it even harder than it already is to make progress on any one of these issues. You are conflating a whole bunch of IMO unrelated things under the umbrella of "tension between what the language should ideally offer and what implementations can offer". However, "what the language should ideally offer" is extremely vague, and the sense in which the language "should" (possibly) offer build script sandboxing is very different from the sense in which rustc "should" not have I-unsound issues. There's a reason the lack of sandboxing is not labeled I-unsound: no language promise is broken by a build script that does odd things on your system, the Rust Abstract Machine remains entirely intact. (That won't help a developer whose system gets attacked that way, of course. But it is how we categorize these kinds of bugs in Rust and it has served us well, so I see no reason for suddenly fundamentally changing gears here.) So as a first step to equip this issue with something like a reasonable scope I think it should focus on language-level promises and not drag in all potential security issues ever -- that kind of unscoped discussion has a very low chance of leading anywhere.
I am still waiting to hear a single argument against just fixing the underlying problem here. In which world is it better to keep the current scheme instead of a collision-resistant hash function, no matter the threat model? If the proposed fix had significant downsides, that would be a different matter, but it doesn't. We can have philosophical discussions about security fences all day long, or we can just fix the issue... For other cases there's much more of a trade-off, i.e. it seems clear that we can't "just" switch incremental compilation to use a collision-resistant hash. Similarly, for the questions around |
People like storing type IDs, so increasing the size again from 128 but to 256 bit would be an increase in memory usage and decrease in performance, which is a disadvantage |
I think this suitable regard has not been applied here. Saying that cryptographic security is necessary to defend against malicious inputs while at the same time exploitable LLVM bugs are not a security issues seems inconsistent to me. There might be some tradeoff that implies a nuanced threat model underpinning that decision, but it was not communicated. It just means the hypothetical well-resourced attacker would shift his attention from looking for hash collisions to fuzzing to find something he can use and spending resources on fixing one thing does not improve our overall security situation.
AIUI one of the proposed fixes is storing all the type names in a binary. This will increase binary sizes which is a real issue for embedded systems. If there's a fix that's free in terms of the produced output then I agree that it's not productive to argue further.
If they're outside the threat model then there's no point in making progress on any of them for the sake of security. They can still be valuable for e.g. reproducible builds and downloading prebuilt binary libraries, but if the only argument for them is to defend against threats we don't intend to defend against then that's wasted effort until there's a project goal to expand the threat model in a coherent manner. For example consider the incremental compilation case. If we say that we don't consider malicious inputs there for performance reasons but do consider malicious inputs in release builds then what does that tell us against which threats we want and don't want to defend? Is the argument that compromising developer machines is worth the tradeoff while for release builds distributed to many people isn't? Should we then warn against using
I don't think soundness issues fall under the security policy either. Ok perhaps this is an angle we haven't discussed explicitly before: If T-lang says that TypeId must have at least cryptographic strength, then does that not imply we're trying to defend against malicious inputs here, and therefore it becomes a security policy issue rather than being only an I-unsound issue? Or is it a project goal (Project Goal?) to become secure against malicious inputs in the future and so work should be done in that direction even though we don't promise that yet? But if that were the case then shouldn't it be discussed explicitly, especially considering some of those things are seemingly unfixable? |
There are no plans to increase the size of
It seems rather odd to me to say "you can't fix that soundness issue unless you also fix that other soundness issue". We're constantly fixing non-soundness issues even though critical LLVM issues exist. There's nothing inconsistent about that. The same applies to #129014 which is a soundness issue or just a bug, depending on whom you ask.
Not fixing issues we can fix certainly does not improve our overall security either. This isn't a zero-sum game, we'll fix the issues we know we can fix now and slowly work towards fixing the ones that can't be fixed yet -- and we hope at some point LLVM will fix the ones that we can't fix in Rust. (Except that going in endless circles debating whether we need a "threat model" and what it should be surely takes away resources that could also be used to fix soundness issues...)
First of all, it'd only be the names of types that actually have their But anyway, that's what #129014 is about, not this issue.
It seems odd to me that we wouldn't fix an issue we know how to fix, no mater what our "security policy" or "threat model" or any other document like that would say. Obviously it's a tradeoff, if the cost is too high we'll have to decide whether it's worth it, but no "security policy" will help us make that tradeoff -- the costs and benefits of both options need to be weighed in each individual case. I don't think the "threat model" approach with a black-and-white distinction between "in scope" and "out of scope" is obviously applicable to our setting, nor do I think it is nearly as useful here as it is in other situations. |
The only difference between a 128bit and 256bit hash is getting the probability from "doesn't happen naturally" to "doesn't happen even with deliberate effort". So I see this not as a soundness fix but as an attempt to fend off attackers. Or put differently, what exactly are we trying to fix if it's not attackers? A natural sort of... threat? |
It's not just the hash size that changes, which you know of course. I see this as a soundness fix, but it seems we have not only very different views but even different frameworks to base our arguments on. 🤷 It seems we have exhausted the amount of new arguments that will surface between the two of us. I'll wait and see what other people think, maybe someone will bring in new arguments. |
If you're implying here that the essence of the soundness fix is not just changing the collision probability but also of the quality of the hash then I have not noticed any argument that siphash is inadequate for non-cryptographic use / for all the properties we would need for soundness barring malicious inputs. |
I think the point of this issue is that as-is, I think |
That's not really true, the effort to construct collisions in our current hash function is not very high. By standards for cryptographic hash functions, SipHash-2-4 with an all-zero-key is weaker than MD5, and generating a collision for that is pretty easy these days. SipHash-1-3 is even weaker. I think we can trust the cryptographic experts when they tell us that constructing a collision is not a significant computational effort, it's just a bunch of annoying work nobody has bothered to do yet. |
I've nominated this for @rust-lang/lang discussion. I still think the issue is written in a much too broad way, making it hard to say anything concrete, but the discussion seems to focus on a reasonably concrete question: should the current The relevant facts and arguments seem to be:
@briansmith and @tarcieri argue that the current situation is unsound since it is entirely computationally feasible to construct an unsound program, and therefore this is a soundness bug and should be fixed. Furthermore, even if we restrict ourselves to "naturally" occurring programs, a 128 bit hash is too small: @the8472 and @Noratrieb say that we should not consider something a soundness issue if it takes specifically constructed malicious input to exploit the bug. So, we'd like to know from the lang team: should the current implementation be considered sound, or should we strive for soundness even against examples constructed specifically to exploit the hash function? Is there an overarching threat model that we uniformly apply to all such questions (that could cover not only TypeId and incremental but also build scripts and would interact with LLVM's stance on malicious inputs), or are we doing case-by-case decisions? If using the current approach is fine, a bunch of sub-questions arise:
|
Can we separate out the questions regarding |
The issue originally asked a question spanning TypeId and incremental compilation and build scripts and a ton of other things, so I already tried hard to narrow down its scope.^^ I didn't want the summary to be completely non-representative of the OP here. But we can point out to t-lang that they might want to further subdivide this question, as you have just done. |
From above:
Following the link, the claim is:
So, I assume you mean that finding collisions in SipHash-2-4-128 with an all-zero key is easier than finding collisions in MD5. Can you substantiate that? The best attack on MD5 finds collisions in about 2^18 calls to the compression function. |
I am relying on domain experts here, I am not a cryptographer. So that would be a question for @briansmith. But given that even the author of SipHash explicitly states that SipHash is not a cryptographic hash function and, with a fixed key, therefore not even designed to be collision-resistant, I would say the burden of proof is on the side of anyone claiming that it is hard to find collisions in SipHash with a fixed key. MD5 was at least designed to be collision-resistant. We shouldn't by default assume that a function has properties that it was never designed to have. |
SipHash-2-4-128 is not collision resistant for an obvious reason. Its output size is only 128 bits. So by the birthday bound, it can't do better than approximately 2^64 security against collision attacks, which is not considered collision resistant these days. (MD5 is fundamentally broken and falls far below the birthday bound.) |
MD5 is fundamentally broken wrt a criterion that SipHash was never even intended to have. Without evidence to the contrary, SipHash should be treated like a non-cryptographic hash function.
EDIT: I am told that that's too naive of a view and there are features of SipHash that make it "obviously" a lot better than a non-cryptographic hash. I'll stay out of this sub-discussion and leave it to the crypto experts. :)
|
Not really. It isn't really about either of those specifically It's about classes of things we want to fix and which we don't or shouldn't because it doesn't make sense to spend effort on them. If we say
then we have a lopsided defense, inconsistent decisionmaking or something along those lines. Similar lines can be constructed between other security-related decisions, e.g. LLVM's security stance is a big one here. I'm trying to encourage looking at the bigger picture here to check if individual decisions make sense in context.
I think this is somewhat of a distraction. For several reasons:
So first we should decide what our security goals are, then what properties a hash function must have to achieve that, then whether the chosen hash function provides that desired level.
It's already fed in as input as part of the metadata hash. That probably has slightly different implications wrt. exploitability than feeding it in as key, but it is changing the result.
Hrrm, I haven't checked the math. Isn't that only correct if you're generating a single huge program containing around 2^64 inputs, which would exhaust memory of any modern computer? If you're generating many smaller individual programs then they're only getting chances to collide within themselves which not only lowers the individual collision probability but also in aggregate. Additionally there are hardware failure rates (such as memory bit flips) to consider which might even exceed those.
Natural failures are part of a threat model and the acceptable failure rate should be defined and then we should consider if that's achievable, yes. |
Bjorn was talking about the crate version, I was talking about the compiler version. AFAIK currently our hash stays stable across compiler versions, but I may be wrong about that. (Also, I was talking all soundness-relevant hashes, not just TypeId.) |
rust/compiler/rustc_span/src/def_id.rs Lines 132 to 134 in c416a6f
|
The compiler version is always hashed in by the compiler. The crate version is hashed in only when using cargo as build system, but any build system which allows multiple crates with the same name needs to hash in something unique between those crates through |
Ah nice. (One of these comments you are quoting is 8 months older than the other, so it wasn't clear at all from your earlier message.) I especially like the part about "includes the commit hash for the official builds and thus isn't predictable for future versions". :) That said, this does not affect the hashes used for incremental builds, or does it? |
It doesn't affect all hashes, but many queries directly or indirectly use DefId, which when getting |
Here's the math: #![feature(f128)]
pub fn est_log2_collision_prob2(m: u16, n: u16) -> f128 {
let (m, n) = ((m as f128).exp2(), (n as f128).exp2());
(1. - (-(n * (n - 1.)) / (2. * m)).exp()).log2()
} Assuming a 128-bit random oracle, the probability that we'll see a collision given
That is, in our use case, a program that contains one million types has one chance in Footnotes
|
(That's what I suggested in point 1 in #10389 (comment).) It is also what the rustdoc documentation for
So the issue regarding
Yes, this is my position. When I mentioned MD5 before, here's what I meant: Look at the number of rounds in MD5, and compare that to the number of rounds in SipHash. Also, look at the complexity of each round in MD5, and compare that to the complexity of each round in SipHash. Clearly MD5 is doing MUCH more work than SipHash. Now take a look at the for SipHash; if SipHash had MD5-level security then it would be an inappropriate design for its performance/security goals, so it would be reduced in strength in its design until it is much weaker than MD5. So, it is extremely unlikely anybody is going to share a well-reasoned argument that SipHash-with-a-fixed-key is stronger than MD5 because it very likely isn't even close (indeed, such an argument could literally be somebody's entire PhD). Would we be happy with MD5-level collision resistance? I doubt it. So, don't waste time chasing this. In other words, the argument is intended to help people avoid wasting time going down the wrong path. |
This argument is too simplistic. Even keyed MD5 is broken. The preimage security of MD5 is broken. Using it in HMAC (where MD5 is applied twice, with a key) is broken.1 Analysis in 2014 found a chosen related key collision2 (i.e., not a "real" collision) in SipHash-2-d. If the authors could have easily found a fixed-key collision attack on SipHash-2-d below the birthday bound, I'm sure they would have reported that much more impressive result instead.3 Meanwhile, we know from 2013 work that full MD5 collisions can be found on a graphing calculator. Observe that SipHash has a 256-bit internal state (to 128-bit for MD5)4 and the message injection offers much less of the kind of control over that state that you want when trying to find internal collisions.5 My feeling is that it would be difficult to support with cryptanalysis the claim that SipHash is weaker than MD5, and so we shouldn't be making confident-sounding statements to that effect, setting aside whether this even matters for the threat model here. Footnotes
|
Why exactly are we spending precious time arguing about whether SipHash is more secure than MD5? It's clear that neither of the two are cryptographically secure, and that's not the point of the issue. The point is whether we want cryptographic security in the first place. The precise algorithm is an implementation detail that doesn't matter. |
I tend to agree. However, you had earlier mentioned:
As reflected in @RalfJung's comment here, the difference between 2^64 (the birthday bound) and ~2^18 (MD5 security) matters to some people for this line of argument, even if 2^64 is not itself cryptographically significant. That's not an argument I would make. But if this does matter to people, then it's worth questioning the claim. |
I think even if TypeId was MD5 (which it obviously will never be), my point still stands. 2^18 is, as the name implies, security. Against adversarial inputs. Which I did explicitly not consider to be in scope. For people not explicitly trying to attack it, the bound is much higher (maybe the birthday bound? maybe something a bit lower? I'm not sure about that). If you think that our use of SipHash is not secure against Rust programmers not actively trying to crack it (and we know that sufficiently bad code can often get close to adversarial), then we should it course change it. Personally, I would argue that threat models against the compiler are a T-compiler issue, so I think this shouldn't involve T-lang at all, other than saying that TypeId must obviously be unique. How and against what the compiler ensures this should be up to T-compiler. |
The math on that is here. Assuming SipHash-c-d-128 is indistinguishable from a PRF, the probability of a random program with an absurd 1M types hitting a natural collision is one in 2^89 (i.e. "never"). |
I would just like to mention that C++ uses more than a simple non-cryptographic hash for ensuring the safety of things like it's I would also like to point out that we in the past have taken steps to let pathological (massive, autogenerated code based on human would ever deign to write) code compile more easily, so we should not be intentionally needing our security and making it potentially unsound. We do not argue against miscompilations (no matter how rare or tortured) as being outside of our "threat model", whatever that means. If I were to come up with a "threat model", I would say that Rust should not miscompile (weird but no malicious) Rust programs and crates. A build script that deleted your hard drive and installs a rootkit is the Rust compiler exacting running the code you told it to, even if you would rather it didn't. A valid program with some weird |
Note that when we talk about collision resistance, we mean specifically resistance against the intentional (i.e. malicious). If you're not worried about malicious programs and crates, you don't need to worry about colliding Footnotes
|
That would depend on whether our hash of choice has any pathologies that may cause certain patterns to exhibit more collisions no? For example, I would be completely fine with a truncated Blake hash, because I am fairly confident in my belief that are no pathological cases there. |
There are no known such pathologies in SipHash1, and there are good reasons to believe that no such significant pathologies may exist. SipHash is a cryptographically-strong pseudo-random function (PRF). That is, it claims (with recommended parameter choices) that, given some random fixed secret key, its behavior is indistinguishable (for a bounded adversary) from that of an actually random function. This claim is unbroken in the literature and has comfortable security margins for the recommended parameter choices. If the output of SipHash exhibited any patterns with higher-than-chance probability, that would break the claim of PRF security.2 That is, finding any such pattern would be an enormous advance compared to the current state-of-the-art cryptanalysis. Footnotes
|
FWIW we're currently not using the recommended parameter choices. We're using SipHash-1-3 instead of SipHash-2-4. |
So I would suggest that for the purpose of threat model discussion we round SipHash's property's off to "a statistically ideal 128bit hash function, not robust against adversarial inputs". The choice of hash function is an implementation detail and can be changed later if it's found to fall short. Or if a better alternative is found. SipHash is interesting as an informative example around speed-quality tradeoffs. It sits between "obviously too weak" and "obviously correct". Practically it's of course also relevant because it's the current implementation, but in principle we could replace it with something in the same category, not just with something in a more secure category. This could be the order of decisions to make:
Independently there can be quality of implementation work, finding alternative approaches that do better than the required minimum as long as we don't have to pay significant penalties for them. |
Yes. And if that worried us enough, that would be trivial to change. Probably I don't find it that worrying, but I'd also probably be interested in a representative benchmark with e.g. SipHash-1-5-128, SipHash-2-4-128, etc. to see what it would cost to remove some caveats. Having numbers on the statistical distribution of the length of the inputs used to compute Some analysis: In SipHash-1-3-128 (what we use), the last 7 bytes of input go through only 4 rounds to emit the first 64 bits of output. Then 3 more rounds happen to emit the last 64 bits of output. To give some intuition, here's what a one-bit change looks like propagating through each of 6 SipHash rounds (this visualizes the internal 256-bit state): We can see immediately that 3 rounds doesn't achieve full diffusion. At four rounds it's hard to eyeball, but we know from cryptanalysis that there are practical distinguishers. These rely on carefully crafted inputs. But things get an awful lot harder with each additional round. There are no reported distinguishers on five rounds. So, as it applies to us with SipHash-1-3-128, I'd say the first 64 bits of the Since 3 rounds obviously isn't enough for a secure pseudo-random permutation, I'd also suspect there may exist some theoretically detectable correlations between the first 64 bits of output and the second 64 bits of output, though these could still be pretty hard to find in context. The most cost-effective way to increase confidence would be increasing the number of finalization rounds. SipHash-1-5-128 would have many fewer caveats, at a cost of 4 total additional rounds per message. Still, given the amount of headroom here, it would surprise me to ever see a SipHash-1-3-128 |
That issue suggests using a 256-bit hash which would double the size of TypeId, no? |
No, we'd use 64 of the 128 bits of TypeId to store a pointer to the hash. |
FYI: t-compiler had a steering meeting on the implementation of |
From #129014 and #10389
I think we should take a step back and take a look at the bigger picture. I have tried to get people to spell out their threat model so that we can apply it as a razor to close/postpone all bug reports that are outside that model rather than making incoherent attempts to fix some things but leaving similar or more easily exploitable things open for the foreseeable future.
Previously the lang-team decided that it would be sufficient to use a 256bit cryptographic hash. If we assume – as #129014 does – that that statement was also meant as a lower bound required to satisfy T-lang's threat model then this has implications about what kind of things a language implementation should defend against. But afaict the rationale was not documented, so it remains unclear exactly against which threats an implementation should defend.
On the other hand it has been said in several places – e.g. in the context of build script sandboxing ¹ ², I-unsound issues, LLVM bugs and possibly incremental compilation – that rustc currently does not and cannot promise to be robust against malicious inputs.
So it seems like things are underspecified and there's tension between what the language should ideally offer and what implementations can offer. This in turn raises the question how the gap between the upper bound of what the spec allows/wants vs. what implementations offer should be handled.
It's also unclear to me to which extent this is a lang vs. a compiler decision.
Now to the concrete case of TypeId:
At first blush it seems that as long as the compiler does not try to resist malicious inputs on many fronts (proc macros, build.rs, compiler bugs, language bugs, llvm bugs) it does not make much sense that an actual TypeId implementation should spend much effort defending against that. And given the de-facto security levels that the compiler can offer a 128bit hash and only accounting for non-malicious inputs should remain sufficient for the moment, which would also be consistent with previous T-compiler decisions.
There might be some arguments that TypeId is different from other holes, e.g. as @briansmith tried here to gesture at some difficulty to detect exploits but there were counterpoints regarding the reliability and the possibility of such a scenario and we failed to reach agreement in subsequent discussion.
Imo the question What makes TypeId sit on one side of the security fence while other things sit on the other side? remains unanswered.
Perhaps it would be better if on the language level it is specified as globally unique value/exact comparison without mandating a particular implementation. That would leave the probabilistic reasoning, choice of comparison method etc. to the compiler. It could then decide to switch to a stronger implementation when it makes sense along with other hardening efforts.
The text was updated successfully, but these errors were encountered: