Feature: pkcs11-tool based HSM support for factory #278

Merged
vkhoroz merged 17 commits into main from vkhoroz-pkcs11-tool on Oct 5, 2023


vkhoroz
Member

@vkhoroz vkhoroz commented Sep 27, 2023

The idea is to keep building Fioctl statically and to provide HSM support by wrapping pkcs11-tool.
The tool is available for Linux, Windows, and Darwin; a user can also build from source.

The PR contains 3 implementations:

  • Our current/legacy Bash based PKI, which is now turned off. Its HSM support was fixed, and script execution was made more secure.
  • The Golang based implementation was enhanced with pkcs11-tool to support HSM devices, and turned on for all platforms.
  • An alternative native Golang implementation via CGo package is also added, but turned off by default as it only works for a dynamically linked (not static) binary.

I find it reasonable to keep all 3 implementations for reference.

@vkhoroz vkhoroz self-assigned this Sep 27, 2023
@vkhoroz
Member Author

vkhoroz commented Sep 27, 2023

I took a part of the refactoring made by @detsch from #270 and replaced the HSM implementation with calls to pkcs11-tool.

The result is a working solution, as confirmed by a unit test that generates a valid certificate.
It is far from complete and needs a lot of enhancement (e.g. a default tool path for Windows, the ability to provide a custom path, a check for whether the key already exists, etc.).
But it saves us from the CGo cross-compilation nightmare and from the need to provide separate binaries for HSM.
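
To illustrate what "calls to pkcs11-tool" can look like from Go, here is a minimal sketch of key pair generation through the external tool. The type and function names (hsmInfo, keygenArgs, generateKey) are hypothetical and not taken from this PR; the flags are standard pkcs11-tool options, and the tool is assumed to be on the PATH.

```go
package main

import (
	"fmt"
	"os/exec"
)

// hsmInfo holds what is needed to reach a token (hypothetical type, not from the PR).
type hsmInfo struct {
	Module     string // path to the vendor PKCS#11 library
	Pin        string
	TokenLabel string
}

// keygenArgs builds the pkcs11-tool arguments that generate a P-256 key pair
// with the given object ID (hex) and label.
func keygenArgs(hsm hsmInfo, id, label string) []string {
	return []string{
		"--module", hsm.Module,
		"--token-label", hsm.TokenLabel,
		"--pin", hsm.Pin,
		"--keypairgen",
		"--key-type", "EC:prime256v1",
		"--id", id,
		"--label", label,
	}
}

// generateKey runs pkcs11-tool and returns its output as part of the error on failure.
func generateKey(hsm hsmInfo, id, label string) error {
	out, err := exec.Command("pkcs11-tool", keygenArgs(hsm, id, label)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("pkcs11-tool keypairgen failed: %w: %s", err, out)
	}
	return nil
}

func main() {
	hsm := hsmInfo{Module: "/usr/lib/softhsm/libsofthsm2.so", Pin: "1234", TokenLabel: "mine"}
	// Only print the command line here; generateKey would actually touch the token.
	fmt.Println("pkcs11-tool", keygenArgs(hsm, "01", "root-ca"))
}
```

Keeping the argument construction separate from the process execution makes the command line easy to unit test without a token.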

Steps to test on Ubuntu:

# Initialize environment
sudo apt install softhsm2 opensc
mkdir -p ${HOME}/.softhsm/tokens
echo "directories.tokendir = ${HOME}/.softhsm/tokens" > ~/.softhsm/softhsm2.conf
export SOFTHSM2_CONF=$HOME/.softhsm/softhsm2.conf
softhsm2-util --init-token --slot 0 --label mine --so-pin=1234 --pin=1234

# Run tests
go test ./x509 -v

# Verify results
pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so --token-label mine --id 1 --label root-ca --pin 1234 --list-objects
openssl x509 -in x509/factory_ca.pem -text

# clean env (before re-running the test, otherwise it fails)
pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so --token-label mine --id 1 --label root-ca --pin 1234 --delete-object --type pubkey
pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so --token-label mine --id 1 --label root-ca --pin 1234 --delete-object --type privkey
rm -f x509/factory_ca.pem

@vkhoroz
Member Author

vkhoroz commented Sep 27, 2023

Test produced a valid certificate based on softhsm keys for me locally:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            fd:84:6b:59:3f:96:64:98:62:2b:e9:bc:01:a7:4c:d4:92:78:08:76
        Signature Algorithm: ecdsa-with-SHA256
        Issuer: CN = Factory-CA, OU = test
        Validity
            Not Before: Sep 27 19:04:53 2023 GMT
            Not After : Sep 27 19:04:53 2043 GMT
        Subject: CN = Factory-CA, OU = test
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
                pub:
                    04:63:7d:93:1c:a9:99:50:68:db:da:79:3c:f2:1f:
                    ca:72:6c:20:da:64:61:48:94:33:f5:3c:a0:ca:81:
                    f0:d7:11:6a:87:50:05:1d:21:c5:6c:2e:30:b4:8c:
                    b7:ba:b8:74:fe:ab:6a:9f:2f:5b:57:e4:f8:50:25:
                    23:49:bc:0f:af
                ASN1 OID: prime256v1
                NIST CURVE: P-256
        X509v3 extensions:
            X509v3 Key Usage: critical
                Certificate Sign
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:TRUE
            X509v3 Subject Key Identifier: 
                90:0D:83:D7:F3:98:EC:D2:A9:5F:8B:35:83:1E:A7:F3:5D:C6:83:E9
    Signature Algorithm: ecdsa-with-SHA256
    Signature Value:
        30:44:02:20:25:77:68:89:d2:d1:6e:ad:42:9d:54:b3:1f:26:
        db:b5:fa:af:20:41:f9:94:77:70:a3:b1:93:b4:7f:65:bf:db:
        02:20:37:41:d6:ff:de:03:51:41:cd:af:16:75:80:e5:c6:67:
        7d:a7:3a:19:f7:c4:d6:34:c0:aa:20:7e:90:f5:f9:f9
-----BEGIN CERTIFICATE-----
MIIBqzCCAVKgAwIBAgIVAP2Ea1k/lmSYYivpvAGnTNSSeAh2MAoGCCqGSM49BAMC
MCQxEzARBgNVBAMMCkZhY3RvcnktQ0ExDTALBgNVBAsMBHRlc3QwHhcNMjMwOTI3
MTkwNDUzWhcNNDMwOTI3MTkwNDUzWjAkMRMwEQYDVQQDDApGYWN0b3J5LUNBMQ0w
CwYDVQQLDAR0ZXN0MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEY32THKmZUGjb
2nk88h/Kcmwg2mRhSJQz9TygyoHw1xFqh1AFHSHFbC4wtIy3urh0/qtqny9bV+T4
UCUjSbwPr6NhMF8wDgYDVR0PAQH/BAQDAgIEMB0GA1UdJQQWMBQGCCsGAQUFBwMB
BggrBgEFBQcDAjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBSQDYPX85js0qlf
izWDHqfzXcaD6TAKBggqhkjOPQQDAgNHADBEAiAld2iJ0tFurUKdVLMfJtu1+q8g
QfmUd3CjsZO0f2W/2wIgN0HW/94DUUHNrxZ1gOXGZ32nOhn3xNY0wKogfpD1+fk=
-----END CERTIFICATE-----

Public Key Object; EC  EC_POINT 256 bits
  EC_POINT:   044104637d931ca9995068dbda793cf21fca726c20da6461489433f53ca0ca81f0d7116a8750051d21c56c2e30b48cb7bab874feab6a9f2f5b57e4f850252349bc0faf
  EC_PARAMS:  06082a8648ce3d030107
  label:      root-ca
  ID:         01
  Usage:      encrypt, verify, wrap, derive
  Access:     local
Private Key Object; EC
  label:      root-ca
  ID:         01
  Usage:      decrypt, sign, unwrap, derive
  Access:     sensitive, always sensitive, never extractable, local

@vkhoroz
Member Author

vkhoroz commented Sep 27, 2023

@mike-sul @doanac @detsch
If you like this idea, I will elaborate on it to make it fault-tolerant.

@mike-sul
Contributor

The tool is available for Linux, Windows, and Darwin; a user can also build from source.

If the tool is available on all required platforms then it seems like a good way forward. I thought about building our own tool around pkcs11/openssl, but since such a tool already exists, why not use it.

@vkhoroz
Member Author

vkhoroz commented Sep 28, 2023

I thought about building our own tool around pkcs11/openssl, but since such a tool already exists, why not use it.

What I find even better in this solution is that we don't need to wrap openssl.
We still use the Golang native X509 implementation, but offload a small part, namely ECDSA key generation and signing, to pkcs11-tool so that these crypto functions run on the HSM.

This is not as good as talking directly to the vendor specific PKCS11 library, but it saves us a lot of cross-compilation headache.
In the end, from a philosophical perspective, talking to a shared object and talking to an executable are almost equivalent.
Either way, we remove at least 2 dependencies:

  • (Most important) Bash, so that we can no longer be accused of remote script execution.
  • (Less important) OpenSSL, so that we are not subject to related CVEs.
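
A minimal sketch of how signing could be offloaded while the rest stays in native Go: a crypto.Signer whose Sign method shells out to pkcs11-tool. The names hsmSigner, toolRunner, and softwareRunner are hypothetical, and the sketch assumes that `--signature-format openssl` yields a DER-encoded ECDSA signature. The runner is injectable so the flow can be exercised without a token.

```go
package main

import (
	"crypto"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
	"io"
	"os"
	"os/exec"
	"path/filepath"
)

// toolRunner executes pkcs11-tool with the given arguments.
type toolRunner func(args ...string) error

// runPkcs11Tool is the real runner; it expects pkcs11-tool on the PATH.
func runPkcs11Tool(args ...string) error {
	if out, err := exec.Command("pkcs11-tool", args...).CombinedOutput(); err != nil {
		return fmt.Errorf("pkcs11-tool failed: %w: %s", err, out)
	}
	return nil
}

// hsmSigner (hypothetical) implements crypto.Signer by delegating the
// signature to pkcs11-tool; the private key never enters this process.
type hsmSigner struct {
	module, tokenLabel, pin, id string
	pub                         crypto.PublicKey
	run                         toolRunner
}

func (s *hsmSigner) Public() crypto.PublicKey { return s.pub }

func (s *hsmSigner) Sign(_ io.Reader, digest []byte, _ crypto.SignerOpts) ([]byte, error) {
	dir, err := os.MkdirTemp("", "hsm-sign-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	in, out := filepath.Join(dir, "digest.bin"), filepath.Join(dir, "signature.der")
	if err := os.WriteFile(in, digest, 0o600); err != nil {
		return nil, err
	}
	// Assumption: the "openssl" signature format is the DER encoding Go expects.
	err = s.run("--module", s.module, "--token-label", s.tokenLabel, "--pin", s.pin,
		"--id", s.id, "--sign", "--mechanism", "ECDSA", "--signature-format", "openssl",
		"--input-file", in, "--output-file", out)
	if err != nil {
		return nil, err
	}
	return os.ReadFile(out)
}

// softwareRunner is a test double: it signs with an in-memory key instead of a token.
func softwareRunner(key *ecdsa.PrivateKey) toolRunner {
	return func(args ...string) error {
		in, out := args[len(args)-3], args[len(args)-1]
		digest, err := os.ReadFile(in)
		if err != nil {
			return err
		}
		sig, err := ecdsa.SignASN1(rand.Reader, key, digest)
		if err != nil {
			return err
		}
		return os.WriteFile(out, sig, 0o600)
	}
}

// selfCheck signs a digest through hsmSigner and verifies the result.
func selfCheck() error {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return err
	}
	signer := &hsmSigner{id: "01", pub: &key.PublicKey, run: softwareRunner(key)}
	digest := sha256.Sum256([]byte("to be signed"))
	sig, err := signer.Sign(rand.Reader, digest[:], crypto.SHA256)
	if err != nil {
		return err
	}
	if !ecdsa.VerifyASN1(&key.PublicKey, digest[:], sig) {
		return errors.New("signature did not verify")
	}
	return nil
}

func main() {
	fmt.Println("self-check error:", selfCheck())
}
```

With a real token, `run` would be set to runPkcs11Tool and `pub` to the public key read from the HSM.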

@mike-sul
Contributor

mike-sul commented Sep 28, 2023

Either way, we remove at least 2 dependencies:

* (Most important) Bash, so that we can no longer be accused of remote script execution.

* (Less important) OpenSSL, so that we are not subject to related CVEs.

pkcs11-tool depends on libcrypto which usually comes with openssl.

@detsch
Member

detsch commented Sep 28, 2023

@mike-sul @doanac @detsch If you like this idea, I will elaborate on it to make it fault-tolerant.

It seems to me this is the way to go. Code organization looks good too. I can help with some testing if you need.

@vkhoroz
Member Author

vkhoroz commented Sep 28, 2023

@doanac @mike-sul @detsch @StealthyCoder @camilamacedo86

Full testing will be done tomorrow, but this is kind of ready to go.

@vkhoroz vkhoroz changed the title PoC: pkcs11-tool based HSM support WIP: pkcs11-tool based HSM support Sep 28, 2023
Member

@doanac doanac left a comment


Is there any use in keeping the bashpki? I don't think we'll ever test or ship it. So it's basically just us carrying around unsupported code, right?

@mike-sul
Contributor

mike-sul commented Sep 29, 2023

Is this the correct command to store key/cert in HSM?

fioctl keys ca create $PWD/pki  --hsm-module /usr/lib/softhsm/libsofthsm2.so --hsm-pin 1234 --hsm-token-label mine
Creating offline root CA for Factory
Signing Foundries TLS CSR
Signing Foundries CSR for online use
Creating local device CA
Uploading signed certs to Foundries

I tried it and got no errors; however, nothing is stored in the HSM, and all items are on the file system.

@vkhoroz
Member Author

vkhoroz commented Sep 29, 2023

Added unit tests and all fixes I found while testing.
I will continue end-2-end testing on Monday.

@vkhoroz
Member Author

vkhoroz commented Sep 29, 2023

@mike-sul there was a bug introduced while I was shuffling code around.
Because of that HSM was always disabled.
It is fixed in the latest push.

@vkhoroz
Member Author

vkhoroz commented Sep 29, 2023

@doanac while testing this I found quite a few discrepancies in the Golang based PKI implementation.
Surprisingly enough, the Bash based version was more accurate, even for the HSM use case, although it also had subtle issues.

That said, after I added unit tests which apply exactly the same test flow to all 3 PKI implementations, I came to the conclusion that it is worth keeping all of them.
The reason is this: when we introduce new code which breaks something, we can use the test results to identify which version to blame based on the quorum principle.
The PKI is complex, especially with the HSM support; so IMHO having only one implementation is like walking blind in the wild.
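
The quorum idea can be sketched as a small helper that, given one result per implementation for the same test step, reports which implementations disagree with the majority. This is only an illustration: the function name and the implementation labels are hypothetical, and the actual unit tests in this PR may be organized differently.

```go
package main

import (
	"fmt"
	"sort"
)

// outliers returns the implementations whose result differs from the one
// shared by a strict majority. The boolean is false when no strict majority
// exists, in which case nobody can be blamed by quorum.
func outliers(results map[string]string) ([]string, bool) {
	counts := map[string]int{}
	for _, r := range results {
		counts[r]++
	}
	majority, best := "", 0
	for r, n := range counts {
		if n > best {
			majority, best = r, n
		}
	}
	if best*2 <= len(results) {
		return nil, false
	}
	var odd []string
	for name, r := range results {
		if r != majority {
			odd = append(odd, name)
		}
	}
	sort.Strings(odd)
	return odd, true
}

func main() {
	// Hypothetical outcome of one verification step across three implementations.
	results := map[string]string{
		"bash":        "keyUsage=certSign",
		"pkcs11-tool": "keyUsage=certSign",
		"cgo":         "keyUsage=digitalSignature",
	}
	odd, ok := outliers(results)
	fmt.Println(odd, ok)
}
```

With three implementations, a single regression shows up as one outlier; if all three disagree, there is no quorum and the test itself is the first suspect.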

Please, see the example test results:
https://github.com/foundriesio/fioctl/actions/runs/6355711636/job/17264165068?pr=278

@mike-sul
Contributor

mike-sul commented Oct 2, 2023

I suppose the question of whether to allow storing both the factory and the local/offline CA keys in the HSM, or just one of them, is out of the scope of this PR.

@mike-sul
Contributor

mike-sul commented Oct 2, 2023

I suppose using the factory CA key from the HSM to sign a new offline device CA CSR is out of the scope of this PR, and the same goes for signing device CSRs with an offline CA key stored in the HSM?

@mike-sul
Contributor

mike-sul commented Oct 2, 2023

The happy path works on my host, the factory root CA priv key is stored in the HSM.

pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so --token-label mine --id 1 --label root-ca --pin 1234 --list-objects
Public Key Object; EC  EC_POINT 256 bits
  EC_POINT:   04410494dea3234d6f29920e63fc9d6815b2053dd9552574017ae84bcc1e38c4bcd032379c749d9f1af99c6fe583db653c05a799dc65ff17b561a9fe9b40b479c9613a
  EC_PARAMS:  06082a8648ce3d030107
  label:      root-ca
  ID:         01
  Usage:      encrypt, verify, wrap, derive
  Access:     local
Private Key Object; EC
  label:      root-ca
  ID:         01
  Usage:      decrypt, sign, unwrap, derive
  Access:     sensitive, always sensitive, never extractable, local

@mike-sul
Contributor

mike-sul commented Oct 2, 2023

When I ran fioctl keys ca create --hsm-module... for the first time, my HSM was misconfigured. As a result, the certs/keys were created and the secret was stored on the backend, but the factory root CA priv key wasn't stored anywhere due to the write to HSM error.
If a customer runs into a similar issue, they will have to contact our support to reset their factory PKI in order to re-run the fioctl keys ca create --hsm-module... command.

I am not sure this potential issue should be discussed in the scope of this PR at all; I just want to mention it for future consideration.

@vkhoroz
Member Author

vkhoroz commented Oct 2, 2023

@mike-sul

I suppose $XYZ is out of scope.

Yes, the only things in scope are:

  • Make sure that Bash-based implementation is fixed for the HSM use case.
  • Achieve feature parity between the Golang-based and Bash-based implementations.
  • Enable the Golang based implementation wrapping pkcs11-tool by default (by the build tag).
  • I also voluntarily added unit tests to cross-verify that both implementations work for the happy path.

All other improvements need discussion, and risk growing this PR indefinitely.

secret was stored on the backend, but the factory root CA priv key wasn't stored anywhere due to the write to HSM error

This sounds like a bug, the command should have failed.
I will test for this scenario tomorrow.

Contributor

@mike-sul mike-sul left a comment


LGTM.

This is quite a big change. I've reviewed it without a detailed analysis of the PKI key/cert generation for each implementation type.

I suppose we assume that the unit tests guarantee that the keys/certs are correct.

In any case, I believe it makes sense to test the happy path e2e for the default case. I've only checked it partially. Testing the addition of a new offline device CA and signing a device CSR with the offline CA is going to take some time.

We have already tested this implementation on Windows in the field.
Also tested it on Linux using a local build.

Signed-off-by: Volodymyr Khoroz <[email protected]>
This small discrepancy was found by the newly added unit tests.

Signed-off-by: Volodymyr Khoroz <[email protected]>
Although this is not critical for the CSR, which is guaranteed to be properly formatted by our API,
it is a real problem for a certificate file loaded from the user's filesystem.

Signed-off-by: Volodymyr Khoroz <[email protected]>
This might seem contradictory, but it brings some benefits:
- No shell script files need to be created; they are piped into Bash in memory.
- All temporary CSR files are created and removed in /tmp; only the necessary files stay in the PKI directory.
- Eventually, we can remove these Bash scripts from the API code base.
- Interested users have direct OS access to these scripts, allowing them to build their own custom PKI solution.

I copied these scripts verbatim from the API and only made minimal changes to make them work:
- substituted template parameters with script parameters.
- removed $here variable, as scripts are executed inside the certs dir.
- removed sign-csr execution from create_device_ca, and instead chained these 2 scripts inside Golang.

The resulting implementation was tested in MEDS factory.
To test, build fioctl like this:
```
CGO_ENABLED=0 go build -ldflags "-X=github.com/foundriesio/fioctl/subcommands/version.Commit=v0.36-14-g2cd3815+dirty" -tags bashpki -o bin/fioctl-linux-amd64-bash main.go
```

After that, the following command creates PKI using Bash scripts:
```
bin/fioctl-linux-amd64-bash -c fioed-bash.yml keys ca create pki/fioed-bash --local-ca --online-ca
```

Maybe, later on we will decide to strip this.
For now, I prefer to keep this as a backup solution.

Signed-off-by: Volodymyr Khoroz <[email protected]>
This is the missing part which rendered the Bash-based PKI support inoperable.
Now all the pieces seem to be fine.

Signed-off-by: Volodymyr Khoroz <[email protected]>
This is a preparation for the HSM support:
- Use crypto.PublicKey instead of any.
- Use crypto.Signer instead of *ecdsa.PrivateKey.
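
Why this matters can be shown with a small sketch: once the certificate code accepts a crypto.Signer, the standard library's x509.CreateCertificate works with any key holder, including one that never exposes the private key. The names opaqueSigner, createRootCA, and demoRootCA are hypothetical; the certificate fields only mirror the test certificate shown earlier in this conversation, not the PR's actual templates.

```go
package main

import (
	"crypto"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"io"
	"math/big"
	"time"
)

// opaqueSigner hides the private key behind crypto.Signer, the way an
// HSM-backed key would; here it simply wraps an in-memory key.
type opaqueSigner struct{ key *ecdsa.PrivateKey }

func (s opaqueSigner) Public() crypto.PublicKey { return s.key.Public() }

func (s opaqueSigner) Sign(r io.Reader, digest []byte, opts crypto.SignerOpts) ([]byte, error) {
	return s.key.Sign(r, digest, opts)
}

// createRootCA self-signs a CA certificate using only the crypto.Signer interface.
func createRootCA(signer crypto.Signer, cn string) (*x509.Certificate, error) {
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 127))
	if err != nil {
		return nil, err
	}
	serial.Add(serial, big.NewInt(1)) // keep the serial strictly positive
	tmpl := &x509.Certificate{
		SerialNumber:          serial,
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now(),
		NotAfter:              time.Now().AddDate(20, 0, 0),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, signer.Public(), signer)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// demoRootCA creates a root CA with an opaque signer and checks its self-signature.
func demoRootCA() (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	cert, err := createRootCA(opaqueSigner{key}, "Factory-CA")
	if err != nil {
		return nil, err
	}
	return cert, cert.CheckSignatureFrom(cert)
}

func main() {
	cert, err := demoRootCA()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cert.Subject.CommonName, cert.IsCA, cert.SignatureAlgorithm)
}
```

The same createRootCA would accept a signer backed by an external tool or a PKCS11 library without any change.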

Signed-off-by: Volodymyr Khoroz <[email protected]>
…module

This makes it easier to introduce the HSM support

Signed-off-by: Volodymyr Khoroz <[email protected]>
This allows for different key storage implementations.
This PR immediately adds the filesystem storage for the Golang implementation.
It also adds both filesystem and HSM storage for the Bash implementation.
The pkcs11-tool based implementation is a stub for now; the real one will be added in the next commit.

Signed-off-by: Volodymyr Khoroz <[email protected]>
This decouples the PKCS11 vendor library loading from the Fioctl process.
As a result, we can safely continue building static binaries.

This fully replaces all features of our Bash based PKI implementation with the Golang native approach.

Signed-off-by: Volodymyr Khoroz <[email protected]>
This requires dynamic linking against libc and is hence not compatible with our static releases.
But it was fairly simple to implement, so it may be interesting for some users.

Signed-off-by: Volodymyr Khoroz <[email protected]>
This fixes two problems:
- Some of the PKI files were created by the x509 modules, and some by ca_create.
  This is not only non-uniform, but also makes writing tests harder.
  A small nuisance also comes from the differing error messages.
- Because of the above, different files got different permissions.
  This was true for both the Bash based and the Golang based PKI implementations.

This commit unifies the file writing method:
- It is the responsibility of the x509 package to write files as necessary.
- All PKI files receive read-only (0400) rights to protect them from unintended access.
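
A minimal sketch of such a unified writer, assuming a POSIX filesystem. The function names are hypothetical, and refusing to overwrite an existing file (O_EXCL) is a choice of this sketch, not something stated in the commit.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// writeReadOnly creates path with owner read-only (0400) permissions.
// O_EXCL makes it fail instead of silently replacing an existing key or cert.
func writeReadOnly(path string, data []byte) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o400)
	if err != nil {
		return err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// checkWrite exercises writeReadOnly in a temporary directory.
func checkWrite() error {
	dir, err := os.MkdirTemp("", "pki-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "factory_ca.pem")
	if err := writeReadOnly(path, []byte("dummy")); err != nil {
		return err
	}
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	if info.Mode().Perm() != 0o400 {
		return fmt.Errorf("unexpected permissions %v", info.Mode().Perm())
	}
	// A second write must be rejected because the file already exists.
	if err := writeReadOnly(path, []byte("other")); !errors.Is(err, os.ErrExist) {
		return fmt.Errorf("expected an already-exists error, got %v", err)
	}
	return nil
}

func main() {
	fmt.Println("check error:", checkWrite())
}
```

Routing every PKI file through one helper like this is what keeps permissions and error messages uniform across implementations.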

Signed-off-by: Volodymyr Khoroz <[email protected]>
The default value we provided for this parameter is kind of misleading.
Furthermore, the user might forget to specify the correct label, resulting in a misconfigured HSM module.
It is safer to require the user to provide the token label explicitly.
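
One way to enforce this is an up-front validation of the HSM flags, sketched below. The flag names match the command line shown earlier in this conversation; the function itself and its exact rules are assumptions, not the code from this commit.

```go
package main

import (
	"errors"
	"fmt"
)

// validateHsmOptions (hypothetical) rejects partial HSM configurations
// instead of filling the gaps with defaults.
func validateHsmOptions(module, pin, tokenLabel string) error {
	if module == "" {
		if pin != "" || tokenLabel != "" {
			return errors.New("--hsm-pin and --hsm-token-label require --hsm-module")
		}
		return nil // HSM not requested; keys go to the filesystem
	}
	if pin == "" {
		return errors.New("--hsm-module requires --hsm-pin")
	}
	if tokenLabel == "" {
		return errors.New("--hsm-module requires --hsm-token-label")
	}
	return nil
}

func main() {
	fmt.Println(validateHsmOptions("/usr/lib/softhsm/libsofthsm2.so", "1234", ""))
	fmt.Println(validateHsmOptions("/usr/lib/softhsm/libsofthsm2.so", "1234", "mine"))
}
```

Failing before any key is generated avoids ending up with a half-created PKI when the label was simply forgotten.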

Signed-off-by: Volodymyr Khoroz <[email protected]>
These are the first tests for Fioctl, and not something we usually do.
But I was rather worried about switching from the Bash based to the Golang based PKI,
especially given the fact that we also need to migrate the HSM support, which was broken for ages.

I also worry that it is way too easy to break a feature which is used so rarely, despite being so important.
So, I would prefer these tests to be run as a part of our CI, meaning that this code is executed regularly.

This first commit adds non-HSM tests.
As you might notice, it helped me find a couple of bugs in the Golang based PKI.
It also helped verify that the new Golang based PKI is identical to our legacy Bash based PKI,
and, finally, that both implementations work.

Signed-off-by: Volodymyr Khoroz <[email protected]>
This test helped find a few more things to fix.

It requires the following packages in order to run:
- openssl, opensc, softhsm2, libengine-pkcs11-openssl

It does all the necessary initialization and cleanup.

Signed-off-by: Volodymyr Khoroz <[email protected]>
Make sure to verify all PKI implementations at CI time

Signed-off-by: Volodymyr Khoroz <[email protected]>
@vkhoroz
Member Author

vkhoroz commented Oct 3, 2023

Rebased on main to pull all linter/testing enhancements and cleaned the commit history.

Will re-test this whole thing tomorrow.

This is a small yet tempting improvement to make a smaller number of calls to the pkcs11-tool.
It improved the test run time from 0.4 to 0.2 seconds (a negligible diff which simply makes me feel more professional).

Signed-off-by: Volodymyr Khoroz <[email protected]>
@mike-sul
Contributor

mike-sul commented Oct 4, 2023

LGTM.

I think it's important to test the happy path for all PKI use cases.

@vkhoroz
Member Author

vkhoroz commented Oct 5, 2023

@doanac @mike-sul please, give it a final cut.

I tested the following happy-path scenarios:

  • Take PKI offline without HSM:

    • Verify all keys and certs exist and are valid;
    • Verify that I can register the device via API (using online CA);
    • Verify that I can register the device with registration-ref (using offline CA);
    • Verify that both devices connect well to device-gateway (the second one is auto-registered), and ostree-proxy.
      • For device gateway, the TUF metadata loading via aktualizr-lite was verified, not the full update.
      • For ostree-proxy only the connection using openssl s_client was verified.
  • Take PKI offline with HSM (with SoftHSM2):

    • Verify all keys and certs exist (except factory_ca.key) and are valid;
    • Verify that the factory CA key is stored in the HSM;
    • Verify that I can register the device via API (using online CA);
    • Verify that I can register the device with registration-ref (using offline CA);
    • Verify that both devices connect well to device-gateway and ostree-proxy.

Please, let me know if you have any other test scenario in mind.

@vkhoroz vkhoroz merged commit 92cd99b into main Oct 5, 2023
7 checks passed
@vkhoroz vkhoroz deleted the vkhoroz-pkcs11-tool branch October 5, 2023 18:22
@vkhoroz vkhoroz changed the title WIP: pkcs11-tool based HSM support Feature: pkcs11-tool based HSM support for factory Oct 5, 2023

5 participants