Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement OpenPGP Signature generation using PGPainless #48

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

vanitasvitae
Copy link
Contributor

Hey!

I noticed, that packager uses Bouncycastle to sign packages.
Using Bouncycastle for OpenPGP operations has the caveat that BC is pretty low-level and error prone.
For example, it will not warn you if your OpenPGP key is expired.

PGPainless is a wrapper around BC that is designed with ease of use and correctness in mind (see these test results).
It abstracts away the hard parts of BC behind a simple API that is designed to be sane and secure by default.

This PR replaces the use of BC for signing with PGPainless. Since PGPainless internally uses BC, this PR implicitly bumps BC from 1.71 to 1.73.
The API of some methods has changed a bit. PGPainless works on PGPSecretKeyRing objects instead of PGPPrivateKeys. The reason is, that an OpenPGP key could have more than one subkey. PGPainless will automatically select the appropriate signing subkey, validating the whole key.
Further, I removed the HashAlgorithm argument, since PGPainless will automatically select a signature hash algorithm based on the keys preferences and an algorithm policy, which prevents the use of weak algorithms (such as SHA1).

Let me know what you think of the patch :)

Copy link

@wiktor-k wiktor-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice. There's a typo in the PR title.

@vanitasvitae vanitasvitae changed the title Implement OpenPGP Signautre generation using PGPainless Implement OpenPGP Signature generation using PGPainless May 30, 2023
@ctron
Copy link
Contributor

ctron commented May 30, 2023

Thanks for the PR. IIRC I had a quick exchange with "Neal" on the fediverse (https://mastodon.dentrassi.de/@[email protected]/110379136551193171) about such a PR.

There are two aspects IMHO: a) making it easier to work with crypto (thus reducing errors) … b) reducing the number of dependencies (thus reducing the code base which might contain bugs).

To my understanding, this PR (using pgpainless) adds a new layer on top of bouncycastle (BC) not replacing it.

There's a bunch of tasks which we have to do here:

  • You need to sign the Eclipse Contributor Agreement (ECA): https://accounts.eclipse.org/user/eca (IIRC the link only works once you logged in with your Eclipse Foundation account.
  • I (and probably some more people) need to review this (which might take a bit of time, just to set the right expectations about how fast this will go. And please to ping me if you think this is stalled).
  • I need to test this with the rpm-builder Maven plugin (https://github.com/ctron/rpm-builder). That's one of the dependents of this library, and this has a direct impact on this.
  • I need to check the license of pgpainless (so that we are compatible with the EPL)
  • You might want to summarize (bullet point style) "why" this is change is an improvement. I think you already did that in the initial comment, but just to call it out explicitly when making a decision.

So just to be clear, I am still not 100% convinced that adding this out-weights adding a new dependency. I think it makes sense to have an open discussion around it, being able to prove/document why this makes sense. I hope you understand this.

If you want, you can also join the Matrix room that we have (https://matrix.to/#/#packager:matrix.eclipse.org), which might make a conversion (or pinging) easier than on GitHub.

@ctron ctron requested a review from dwalluck May 30, 2023 08:32
@ctron
Copy link
Contributor

ctron commented May 30, 2023

I did a first test just to see what the impact is. I did this in combination with the rpm-builder Mojo, as that's my primary use case. Here are some findings:

  • The new dependency (pgpainless-core) isn't an OSGi bundle. That breaks compatibility with people using OSGi, which I do know people did in the past. Not sure how much emphasis we should put on this, but this would definitely break this.
  • The concept switched from using the "private key" to using the "secret key ring" plus the "protector". Makes sense to some degree, but also breaks some APIs in the process. RsaHeaderSignatureProcessor and RsaSignatureProcessor don't have the new API, which would make it impossible to use the new version with the rpm-builder.
  • It might be worth considering deprecating the old APIs (using private key). In any case, that something to leave when finishing up.

I think it might make sense to combine this PR with a PR on the rpm-builder. I know the rpm-builder isn't part of this project, but it is one of the main users.

@vanitasvitae
Copy link
Contributor Author

Hey,
sorry for the slight delay.

Benefits of using PGPainless vs. Bouncycastle:

Using Bouncycastle directly leaves bigger room for error

While I see that you are using the BC API mostly correctly in the existing codebase, there are some places where the existing implementation is not acting ideal:

  • Usage of SHA-1 as signature hash algorithm
    SHA-1 is mostly deprecated, and modern implementations already started rejecting signatures made using SHA1. Sequoia-PGP (which is now used in the signautre-verification part of RPM in Fedora 38) for example does, and published a great blog post about their challenges with dealing with packages using outdated crypto.
  • BCs isSigningKey() is misleading
    The existing code used PGPSecretKey.isSigningKey() to check, whether a subkey of a secret key is capable of signing. The issue is, that the BC method only checks, whether the key uses an algorithm which can be used to create signatures, not whether the key is actually designated for signing (e.g. consider an RSA key with a CERTIFY_ONLY keyflagged primary key and a designated SIGN_DATA keyflagged subkey. Here the existing code will use the primary key to generate the signature, which compliant implementations (e.g. sequoia-rpm) will later on reject). PGPainless is doing the correct (and complicated) thing, validating the whole secret key and finding the right signing subkey to use.

PGP is complex

  • The existing code does not check key validity.
    As far as I can tell, the existing code will not check, whether the signing key is actually usable (non-revoked, non-expired).
    This can lead to issues with the key only being discovered when someone tries to install a package.
  • PGPainless uses sane algorithm and key policies and keeps updating its default policy wiht each release
    With the existing implementation, e.g. hash algorithms are hard-coded, or more specifically need to be chosen by the user. This can quickly lead to used algorithms getting outdated (see SHA-1).
    Further, PGPainless will not only check for outdated algorithms, but also outdated key lengths and for example refuse to use RSA keys smaller than the currently recommended minimum length of 2000 bits.

Arguably, all these issues are fixable in code here, since producing signatures is way easier than verifying them, but I still think it is worthwhile relying on a library whose sole purpose is to get the PGP part right :D

Regarding OSGI: I need to look into this, as honestly I never dealt with OSGI bundles before. If it isn't too hard though, I can see what I can do to make PGPainless compliant.

I think you are right with regards to the API breakage with PGPPrivateKeys. I will restore the old methods and mark them as deprecated instead, like you proposed. There is no need to rush this PR, just merge it once you are happy with it :)

@vanitasvitae
Copy link
Contributor Author

I added back some of the old signing logic and moved some new logic to dedicated classes.
Retrofitting the RpmFileSignatureProcessor with the old API is pretty ugly, so I'll think about a cleaner way for a bit and update the PR when I find a nice solution that does not entail duplicating most of the class.

I looked into OSGI. Unfortunately there doesn't appear to be an official Gradle plugin to build OSGI bundles, so the process would be very manual-ish. Do you know of a place to look that I might have missed?

Another question that crossed my mind: The current implementation tags signatures using the RpmSignatureTag.RSAHEADER. Is that header appropriate to use with e.g. EdDSA signatures? Or is RPM signing limited to RSA and DSA?

I made some changes to PGPainless which this PR will benefit from and which will make it into a new release I will be doing soon :)

@ctron
Copy link
Contributor

ctron commented Jun 7, 2023

No worries about delays. I am the one not finding enough time for this :) Btw, I will be out for a week and a half, so don't expect too much during that time.


Sounds good. I also think that it may be ok to trim down old APIs. As long as there's an alternative patch and we still can support the RPM builder use case.

Regarding OSGi (and Gradle), stackoverflow claims it is as easy as apply plugin: 'osgi'. Not sure that is true. Handling that manually isn't a great experience. I think you should avoid that.

Bnd used to be a thing, looks like it still is: https://plugins.gradle.org/plugin/biz.aQute.bnd

And again, not having OSGi wouldn't be a roadblock for me. Maybe someone asks for it, then we would need to figure it out. Just noted that this would be the case.


Following up on the "why":

Tell me about SHA1 and GPG 😁 The reason I don't have too much time for this is SHA1, v3 signatures, gpg and sequoia. Just in the Rust world, which I prefer a bit more.

Arguably, all these issues are fixable in code here, since producing signatures is way easier than verifying them, but I still think it is worthwhile relying on a library whose sole purpose is to get the PGP part right :D

I agree there. Every library should have a focus/scope and PGP isn't the focus of "packager", so that makes sense. And to be honest, the more I see in Sequoia, the less I want to do myself in this area :)


Regarding the compatibility with the RPM builder, I was hoping to move forward to the new implementation. Which requires some changes in the RPM builder too. Otherwise there wouldn't be too much usage of the new feature. So kicking out the old APIs is fine. And I am not a real fan of the naming of SigningStream2.

I will try to adapt this on the RPM builder, but I just don't know how/when I will find time for this.

@ctron
Copy link
Contributor

ctron commented Jun 7, 2023

I will follow up with a few things I notice as I progress (if that's not your style, then please just ignore it, I will try to sum it up later):

  • I just noticed that some of my confusion came from the new classes PgpHeaderSignatureProcessor and PgpSignatureProcessor. Compared to the Rsa* ones. I think it makes sense to replace the old (Rsa) ones. For one the naming seems off, it's feels confusing.
  • On the PgpHeaderSignatureProcessor constructor we should always have the protector argument, right? If a user doesn't need it, there is a way. But defaulting to "unprotected" (with no alternative) seems weird.
  • The signing operations take in a "secret key ring", but don't seem to specific which key to use from that. How is supposed to work?

@ctron
Copy link
Contributor

ctron commented Jun 7, 2023

  • The signing operations take in a "secret key ring", but don't seem to specific which key to use from that. How is supposed to work?

Ok, I guess I figured it out: I assumed "key ring" would contain multiple keys. But that seems to be PGPSecretKeyRingCollection in this case.

@ctron
Copy link
Contributor

ctron commented Jun 7, 2023

I created a PR which should adopt those changed in RPM packager: ctron/rpm-builder#78

@vanitasvitae While the tests run fine, I would appreciate second pair of eyes.

@vanitasvitae
Copy link
Contributor Author

vanitasvitae commented Jun 7, 2023

Regarding the compatibility with the RPM builder, I was hoping to move forward to the new implementation. Which requires some changes in the RPM builder too. Otherwise there wouldn't be too much usage of the new feature. So kicking out the old APIs is fine. And I am not a real fan of the naming of SigningStream2.

Yeah, the naming is unfortunate. I did it because I did not want to remove the old SigningStream to keep the old API.

I just noticed that some of my confusion came from the new classes PgpHeaderSignatureProcessor and PgpSignatureProcessor. Compared to the Rsa* ones. I think it makes sense to replace the old (Rsa) ones. For one the naming seems off, it's feels confusing.

I will remove the RSA classes then :)

On the PgpHeaderSignatureProcessor constructor we should always have the protector argument, right? If a user doesn't need it, there is a way. But defaulting to "unprotected" (with no alternative) seems weird.

Good point!

The signing operations take in a "secret key ring", but don't seem to specific which key to use from that. How is supposed to work?

Magic :)
BCs naming conventions on keys is a bit confusing. A "KeyRing" is really a primary key with optional subkeys.
What other implementations call KeyRing is named KeyRingCollection in BC. See https://github.com/pgpainless/pgpainless/wiki/Terminology
PGPainless inspects the whole secret key ring (the users key) and identifies the subkey that is dedicated for signing. It does so by checking the self-signatures on the keys, looking for a key marked with the "SIGN_DATA" key flag.
Similarly, PGPainless also decides which hash algorithm to used, based on its policy and the hash algorithm preferences of the key.

@Valodim
Copy link

Valodim commented Jun 7, 2023

PGPainless inspects the whole secret key ring (the users key) and identifies the subkey that is dedicated for signing. It does so by checking the self-signatures on the keys, looking for a key marked with the "SIGN_DATA" key flag.

What if there are multiple candidates? (this is an age-old question)

@vanitasvitae
Copy link
Contributor Author

What if there are multiple candidates? (this is an age-old question)

PGPainless currently signs with all available signing keys, if no specific key-id is given.

@Valodim
Copy link

Valodim commented Jun 7, 2023

I believe that is the correct behavior for a general purpose implementation 👍 I'm not sure if it is the correct one in this case, but I would suggest if only one signature is desired the correct way to address that would be to not have the other signing subkeys available for signing in this context in the first place.

@vanitasvitae
Copy link
Contributor Author

vanitasvitae commented Jun 7, 2023

Good point, thanks for bringing this up.
Actually, the way I use PGPainless in this PR only outputs a single signature, even though multiple are created. I should optimize that and only sign with a single subkey (the first one perhaps?).

Edit: And the ugly iterator().next().iterator().next() thing is going away once I make another PGPainless release :)

@Valodim
Copy link

Valodim commented Jun 7, 2023

"The first one" is likely what a BC implementation did before as well, but it's not exactly a well defined criteria. Then again, any signing subkey is as good as the next so it's not "wrong" to do like that per se 🤷

@ctron
Copy link
Contributor

ctron commented Jun 7, 2023

Good point, thanks for bringing this up. Actually, the way I use PGPainless in this PR only outputs a single signature, even though multiple are created. I should optimize that and only sign with a single subkey (the first one perhaps?).

Edit: And the ugly iterator().next().iterator().next() thing is going away once I make another PGPainless release :)

They way I use it now in the RPM builder is to load the full key chain collection, pick the key by ID (as provided by the user). That results in a "key ring", which to my understanding now still could have multiple keys for signing.

Couldn't we simply allow the user to select this (via an Optional, falling back the the first suitable)? If the user provides a key ID, then the "key ring" containing this key should be picked, and when signing this specific key should be used. If that's not possible, it should fail.

That way we would have both: sane defaults & control by the caller

@Valodim
Copy link

Valodim commented Jun 7, 2023

Yes, that's right. The only issue is that "the first" is essentially picking one (deterministically) at random. As is any other criteria - the newest, the one with the longest expiration, etc, none of those qualify as "least surprising" to the user.

If there is already an option to pick, then requiring that argument if there are multiple choices would avoid any ambiguity, which sounds like a good thing for me. Most folks don't have two signing subkeys on their certificate, and those who do for one reason or another would just have to pick.

If that sounds too bothersome for users of such keys, it would be good to at least output a hint to the user that this is happening when it is 👌

@dwalluck
Copy link
Contributor

OK, I still see some issue. We have RsaHeaderSignatureProcessor but no DsaHeaderSignatureProcessor. And it looks like there is RsaSignatureProcessor processor, but it's not actually RSA-specific.

So, maybe we can just make RsaHeaderSignatureProcessor support DSA by also modifying finish() there?

@vanitasvitae
Copy link
Contributor Author

vanitasvitae commented Mar 30, 2024

My idea behind introducing PgpHeaderSignatureProcessor was to relieve the user from needing to make a decision on the algorithm by determining the used public key algorithm on the fly.
Same with PgpSignatureProcessor. The RSA variants were already there, I just added the Pgp classes as generalizations of them to be algorithm agnostic :)

@dwalluck
Copy link
Contributor

My idea behind introducing PgpHeaderSignatureProcessor was to relieve the user from needing to make a decision on the algorithm by determining the used public key algorithm on the fly.
Same with PgpSignatureProcessor. The RSA variants were already there, I just added the Pgp classes as generalizations of them to be algorithm agnostic :)

So, are we deprecating the old classes? I don't see @Deprecated there.

@vanitasvitae
Copy link
Contributor Author

Deprecating them would be an option, yes. I'll create another commit :)

@dwalluck
Copy link
Contributor

dwalluck commented Apr 1, 2024

@ctron So I am OK with this, but do you want to test against rpm-builder first?

<version>${bouncycastle.version}</version>
<groupId>org.pgpainless</groupId>
<artifactId>pgpainless-core</artifactId>
<version>1.6.7</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be provided as a variable.

@@ -8,8 +8,8 @@
* SPDX-License-Identifier: EPL-2.0
*
* Contributors:
* mat1e, Groupe EDF - initial API and implementation
* Jens Reimann, Red Hat, Inc
* mat1e, Groupe EDF - initial API and implementation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't change existing contributor lines, but add yourself as one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This must have been some spotless apply removing the whitespace, sorry.

@ctron
Copy link
Contributor

ctron commented Apr 2, 2024

@ctron So I am OK with this, but do you want to test against rpm-builder first?

Yes, I do think we need a change that works for RPM builder too. Otherwise we're just breaking the current API without having a way to prove that the new API supports existing applications (rpm-builder being one of them).

I am also still not a fan of SigningStream2. Just marking SigningStream as deprecated and pointing to SigningStream without any further comment doesn't seem to make much sense to me.

And just dropping BC for a new dependency, handling all the signing stuff, without some proper reasoning just feels wrong.

I agree that that there are some things to clean up in the current code (like the SHA1 -> SHA256 stuff, and more). But maybe we should split this up. And maybe it makes more sense to provide an additional option (using pgpainless) rather than swapping out the current approach.

As said before, I am ok with a breaking release. I am ok dropping OSGi (and things like Java 8). And I also think it good to enable the use of pgpainless. But I am not so convinced on simply replacing the current code and breaking the API at the same time is the best approach.

My proposal would be:

  • Fork off some those minor improvements into a new PR (SHA like stuff)
  • Re-work this PR in a way that it enables a user to use pgpainless (optional dependency)
  • Show that the new APIs support the existing use case, by having a PR on rpm-builder which makes use of this

I personally don't have time to actually do the implementation. But I can definitely help, test, and review.

@vanitasvitae
Copy link
Contributor Author

I'll create another PR with the small improvements separated out :)
It might not be trivial to transform the PR into one that offers PGPainless as an optional dependency, but I'll try my best :D

Some background for why I chose to create SigningStream2:
The old SigningStream class receives only a PGPPrivateKey object, along with the digest algorithm for signature generation.
This is unfortunate, as the PGPPrivateKey class has no reference to the corresponding PGPPublicKey, which in turn contains self-signatures, which carry metadata such as algorithm preferences and features. This means, that the algorithm "negotiation" would need to be performed by the user before the instantiation of the SigningStream.

PGPainless follows another paradigm. Its philosophy is to take the burden of "key interpretation" (that is decide which algorithms to use, etc.) off the user. PGPainless' signing API therefore takes a PGPSecretKeyRing (the whole secret key + subkeys), where the public key components and signatures are accessible. It then evaluates the signatures and deduces the preferred algorithms, which subkey to use for signing etc. So the user does not need to know anything about the inner workings of OpenPGP, they just need to supply the key and passphrase if needed.
For that reason I decided that it would not make sense to modify the existing SigningStream and to instead create a new class which follows PGPainless' paradigm. Sure, perhaps it would be possible to cram both implementations into one class with multiple constructors and a lot of if-branches, but as I'd consider the "manual approach" to be unfavourable and worthy of deprecation anyways, I figured a new class that would at some point replace the old implementation would be the better way to go.

@ctron
Copy link
Contributor

ctron commented Apr 2, 2024

I see the benefit of SigningStream2! And I think it might make sense to improve SigningStream to be able to do the same, without the need for pgpainless. And I also think it would be beneficial to allow the use of pgpainless of the user wants to do that.

What I am concerned about is adding a new dependency handling quite sensitive material, for the benefit of convenience. I raised that concern initially, and I think last week we saw a good reason of "why". On the other side, I also see the benefit of making pgp less painful, I really do. So I think it makes more sense to let the user choose.

@vanitasvitae
Copy link
Contributor Author

vanitasvitae commented Apr 2, 2024

I understand your points. Especially given the xz debacle, I expect scrutiny for my PRs :D

I'm struggling a bit getting a good idea of how you'd like me to proceed now though. Do you want me to move the new classes into a separate module?

My first impulse was to have a factory class, which can be overridden by the PGPainless implementation, which would swap in the PGPainless classes, so that the user can chose between implementations by swapping out the factory. However, such a class currently doesn't exist and it appears as if the user needs to assemble their signing pipeline by hand.

How would you proceed?

Edit: I would like to work with an API like "Here is a key (file/input stream), here is a passphrase, here is an optional digest algorithm and optional key-id/fingerprint, now give me back some SignatureProcessors / SigningStream".

@dwalluck
Copy link
Contributor

dwalluck commented Apr 2, 2024

I am also still not a fan of SigningStream2. Just marking SigningStream as deprecated and pointing to SigningStream without any further comment doesn't seem to make much sense to me.

I agree, it could use a more descriptive name

@vanitasvitae
Copy link
Contributor Author

I'm working on a new PR which introduces some factories :)

@dwalluck
Copy link
Contributor

dwalluck commented Apr 2, 2024

How would you proceed?

The way I understood it, PGPainless would be an optional dependency <optional>true</optional> in the pom.xml of packager, but if the user adds the jar to their own classpath, then the code would detect the main class and enable that automatically (or, we can set a property, or whatever).

A factory is more of a design issue, but the API should not need to change from a user perspective in order to use PGPainless. If possible, rpm-builder can get support without any changes except adding PGPainless to its own pom.xml.

@vanitasvitae
Copy link
Contributor Author

I see.
I'd propose the following:

  • I'll create a new PR, which introduces Factory class interfaces for SigningStreams and SignatureProcessors. I'm already working on this and its going well.
  • There will be a factory implementation which uses BC only. This resides next to the interface and will be the default.
  • Downstream projects needs to provide the signing key as PGPSecretKeyRing instead of PGPPrivateKey.
  • I'll introduce a new module which implements the factories using PGPainless.
  • Downstream projects would need to change their code to rely on the factories instead of instantiating the BC classes.
  • You contribute the class detection, as I have no idea how to do that :D

@vanitasvitae
Copy link
Contributor Author

vanitasvitae commented Apr 3, 2024

I think I misunderstood the use of the key-id argument.
So far I was under the impression that the user passes in a file containing a single signing key, along with a key-id of the (sub-)key they want to use for signing.

After taking a look at how rpm-builder sets up the signing, I now understand that the file the user passes in may contain many keys (i.e. their GnuPG key ring) and that the key-id is used to identify the particular OpenPGP key that shall be used for signing. This makes much more sense.

But now I wonder: Lets say the user has the following key on their key ring (ignore the non-hex fingerprint):

0xPRIMARY (Certify Others)
- 0xSIGNING (Sign Data)
- 0xENCRYPTION (Encrypt)

It should be obvious, that the only key usable for signing in this case is the subkey 0xSIGNING (though the existing implementation would erroneously use 0xPRIMARY if the user passed in 0xPRIMARY).

My proposal would be: If the user passes in 0xPRIMARY or 0xSIGNING, use the subkey 0xSIGNING for signing.
If the user passes in 0xENCRYPTION, fail with an error.

Alternatively, I could imagine to still use 0xSIGNING in this case, although this might obfuscate some configuration errors.

Now the last question is, how to handle the following key:

0xPRIMARY (Certify)
- 0xSIGNING1 (Sign)
- 0xSIGNING2 (Sign, newer binding time)
- 0xENCRYPTION (Encrypt)

Should the implementation always use the newest signing key here?
What if the user explicitly passed in either 0xSIGNING1 or 0xSIGNING2?
Does RPMs use of OpenPGP allow for multiple signatures?

@vanitasvitae
Copy link
Contributor Author

I created #56 as an alternative to this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants