PKCS11 provider connectivity issue #607

Open
1 of 3 tasks
ionut-arm opened this issue May 12, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@ionut-arm
Member

ionut-arm commented May 12, 2022

The issue

Disconnecting and reconnecting a pluggable PKCS11 token leads to the PKCS11 provider being inaccessible. To reproduce the issue:

  • start the Parsec service with the PKCS11 provider and a pluggable hardware backend (e.g. a NitroKey HSM)
  • unplug the hardware backend
  • attempt to create a key using the parsec-tool, you'll get:
[INFO ] Creating RSA encryption key...
[ERROR] Subcommand failed: a hardware failure was detected (ParsecClientError(Service(PsaErrorHardwareFailure)))

This is as expected.

  • plug the hardware back in
  • attempt to create a key again, you'll get:
[INFO ] Creating RSA encryption key...
[ERROR] Subcommand failed: there was a communication failure inside the implementation (ParsecClientError(Service(PsaErrorCommunicationFailure)))

This error is NOT expected. The service should continue to operate correctly in this case.
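
When scripting this reproduction, it can help to pull the PSA error name out of the parsec-tool output so that the expected failure and the unexpected one are easy to tell apart. A minimal Rust sketch, assuming the error line always has the `ParsecClientError(Service(<name>))` shape seen above; `psa_error_name` is a hypothetical helper, not part of parsec-tool:

```rust
/// Extracts the PSA error name from a parsec-tool `[ERROR]` line.
/// Assumption: the line contains `Service(<name>)`, as in the output above.
fn psa_error_name(line: &str) -> Option<&str> {
    // Skip past the opening `Service(` marker.
    let start = line.find("Service(")? + "Service(".len();
    let rest = &line[start..];
    // The name runs up to the first closing parenthesis.
    let end = rest.find(')')?;
    Some(&rest[..end])
}
```

Lines without a `Service(...)` part, such as the `[INFO ]` line, yield `None`.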

Solution

Some information is still missing and needs further investigation. I'm hoping to find a way to reproduce this using SoftHSM2.

The ideal solution would be for us to simply re-establish a functional connection to the hardware token when we detect that the token has been unplugged and plugged back in. The actual solution will depend on how reliably we can tell whether this has happened and on what options we identify for re-establishing that connection in a clean way.

Outstanding questions

  • What exact error is received after the device is plugged back in? Is this error sufficiently distinctive to identify this exact cause?
  • Do we attempt to re-establish the connection at provider level (i.e., initializing a new PKCS11 context), or do we restart the whole service somehow? Does it matter if we have other providers involved?
  • If we re-establish the connection at provider level, do we retry the operation that made us realize something's broken? Or just send back something akin to "retry later"?
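
The last two questions could be prototyped against a stand-in backend before touching the provider. A rough Rust sketch, in which the `Backend` trait, the `AfterReconnect` policy, the `OpError` type and the `StaleToken` toy backend are all hypothetical and do not reflect the actual Parsec provider code:

```rust
/// Hypothetical error type for one backend operation.
#[derive(Debug, PartialEq)]
enum OpError {
    /// The connection to the token looks broken.
    ConnectionLost,
    /// The connection was re-established; the client should try again.
    RetryLater,
}

/// What to do once the connection has been re-established.
#[derive(Clone, Copy)]
enum AfterReconnect {
    RetryOperation,
    TellClientToRetry,
}

/// Hypothetical stand-in for the PKCS11 context owned by the provider.
trait Backend {
    fn generate_key(&mut self) -> Result<(), OpError>;
    /// Tears down and re-creates the connection to the token.
    fn reconnect(&mut self) -> Result<(), OpError>;
}

/// Runs the operation; on a lost connection, reconnects at provider level
/// and then either retries once or reports "retry later", per the policy.
fn generate_key_with_recovery<B: Backend>(
    backend: &mut B,
    policy: AfterReconnect,
) -> Result<(), OpError> {
    match backend.generate_key() {
        Err(OpError::ConnectionLost) => {
            backend.reconnect()?;
            match policy {
                AfterReconnect::RetryOperation => backend.generate_key(),
                AfterReconnect::TellClientToRetry => Err(OpError::RetryLater),
            }
        }
        other => other,
    }
}

/// Toy backend that fails until `reconnect` has been called.
struct StaleToken {
    connected: bool,
    reconnects: u32,
}

impl Backend for StaleToken {
    fn generate_key(&mut self) -> Result<(), OpError> {
        if self.connected {
            Ok(())
        } else {
            Err(OpError::ConnectionLost)
        }
    }

    fn reconnect(&mut self) -> Result<(), OpError> {
        self.connected = true;
        self.reconnects += 1;
        Ok(())
    }
}
```

With `RetryOperation` the client never sees the failure, at the cost of running the operation twice. With `TellClientToRetry` the service stays simpler, but every client has to handle the extra error.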

This is a variant of the more generic approach discussed in #607

ionut-arm added the bug label May 12, 2022

@anta5010
Collaborator

anta5010 commented May 12, 2022

With RUST_LOG=trace after re-inserting a USB HSM module:

parsec-tool:

# RUST_LOG=trace parsec-tool -p 2 create-rsa-key -k anta-11-new
[DEBUG] Parsec BasicClient created with implicit provider "Mbed Crypto provider" and authentication data "UnixPeerCredentials"
[INFO ] Creating RSA encryption key...
[DEBUG] Running getuid
[ERROR] Subcommand failed: there was a communication failure inside the implementation (ParsecClientError(Service(PsaErrorCommunicationFailure)))

parsec:

[TRACE parsec_service::front::front_end] handle_request ingress
[INFO  parsec_service::front::front_end] New request received from application name "0"
[TRACE parsec_service::back::dispatcher] dispatch_request ingress
[TRACE parsec_service::back::backend_handler] execute_request ingress
[TRACE parsec_service::providers::pkcs11] psa_generate_key ingress
[ERROR parsec_service::providers::pkcs11::utils] Error converted to PsaErrorCommunicationFailure; Error: Some horrible, unrecoverable error has occurred.  In the worst case, it is possible that the function only partially succeeded, and that the computer and/or token is in an inconsistent state.
[TRACE parsec_service::back::dispatcher] execute_request egress
[TRACE parsec_service::front::front_end] dispatch_request egress
[INFO  parsec_service::front::front_end] Response for application name "0" sent back
[TRACE parsec] handle_request egress

@ionut-arm
Member Author

ionut-arm commented May 12, 2022

From the spec:

5.1.1 Universal Cryptoki function return values

Any Cryptoki function can return any of the following values:

· CKR_GENERAL_ERROR: Some horrible, unrecoverable error has occurred. In the worst case, it is possible that the function only partially succeeded, and that the computer and/or token is in an inconsistent state.

· CKR_HOST_MEMORY: The computer that the Cryptoki library is running on has insufficient memory to perform the requested function.

· CKR_FUNCTION_FAILED: The requested function could not be performed, but detailed information about why not is not available in this error return. If the failed function uses a session, it is possible that the CK_SESSION_INFO structure that can be obtained by calling C_GetSessionInfo will hold useful information about what happened in its ulDeviceError field. In any event, although the function call failed, the situation is not necessarily totally hopeless, as it is likely to be when CKR_GENERAL_ERROR is returned. Depending on what the root cause of the error actually was, it is possible that an attempt to make the exact same function call again would succeed.

· CKR_OK: The function executed successfully. Technically, CKR_OK is not quite a “universal” return value; in particular, the legacy functions C_GetFunctionStatus and C_CancelFunction (see Section 5.15) cannot return CKR_OK.

The relative priorities of these errors are in the order listed above, e.g., if either of CKR_GENERAL_ERROR or CKR_HOST_MEMORY would be an appropriate error return, then CKR_GENERAL_ERROR should be returned.
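
The priority rule amounts to reporting the earliest-listed value among those that would be appropriate. A small Rust sketch of that reading; the `UniversalRv` enum and `highest_priority` helper are illustrative only and are not taken from any PKCS11 binding:

```rust
/// The universal return values, declared in the order the spec lists them,
/// so that the derived ordering matches their relative priority.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum UniversalRv {
    GeneralError,
    HostMemory,
    FunctionFailed,
    Ok,
}

/// Returns the value to report when several would be appropriate,
/// or `None` if the slice is empty.
fn highest_priority(applicable: &[UniversalRv]) -> Option<UniversalRv> {
    // Derived `Ord` follows declaration order, so the minimum is the
    // earliest-listed (highest-priority) value.
    applicable.iter().copied().min()
}
```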

I think it's fair to say that if we get CKR_GENERAL_ERROR we should try to reset the connection, no matter the cause. Or bomb out (if we've already tried to reset once).
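
One way to picture that rule is a small decision function that allows a single reset per failure streak. A Rust sketch with hypothetical names (`ResetGuard`, `Action`); it models only the bookkeeping, not the PKCS11 calls themselves. It takes the raw return value as a number, using the values 0x0 for CKR_OK and 0x5 for CKR_GENERAL_ERROR from the PKCS11 headers, and representing the value as `u64` is an assumption:

```rust
const CKR_OK: u64 = 0x0000_0000;
const CKR_GENERAL_ERROR: u64 = 0x0000_0005;

#[derive(Debug, PartialEq)]
enum Action {
    /// Nothing special to do; hand the result back as usual.
    PassThrough,
    /// Re-establish the connection to the token.
    ResetConnection,
    /// A reset was already tried and did not help; give up.
    BombOut,
}

/// Remembers whether a reset has already been attempted.
#[derive(Default)]
struct ResetGuard {
    reset_attempted: bool,
}

impl ResetGuard {
    fn on_return_value(&mut self, rv: u64) -> Action {
        match rv {
            CKR_GENERAL_ERROR if self.reset_attempted => Action::BombOut,
            CKR_GENERAL_ERROR => {
                self.reset_attempted = true;
                Action::ResetConnection
            }
            CKR_OK => {
                // A success shows the connection works again, so a later
                // failure is allowed a fresh reset.
                self.reset_attempted = false;
                Action::PassThrough
            }
            // Other return values neither trigger nor clear a reset here.
            _ => Action::PassThrough,
        }
    }
}
```

Clearing the flag on success is a design choice in this sketch; without it, a second unplug much later would bomb out immediately.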
