-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes for making PDO work in SGX HW-mode #477
Fixes for making PDO work in SGX HW-mode #477
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally good. biggest issue is that you are making the makefile a lot more complicated than it needs to be. just add a "test_hw" (or something) target that explicitly sets the target images. note that you probably need to also change the image name to reflect that hw status (for the images that change)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me (even after Mic's counter-argument to my counter-argument to his ;-)). Also tests seem to run fine.
PS: On non-FLC machines, e.g., SKL, the upstreamed in-kernel driver do not work and the out-of-free kernel in binary distribution or main branch in https://github.com/intel/linux-sgx-driver do not compile anymore. However, applying the patch from https://github.com/intel/linux-sgx-driver/pull/151/files made driver compile and everything work with (on host) sgx 2.23
35b267a
to
132ba81
Compare
I still have a number of conceptual issues with this. It feels like we are designing around the code rather than figuring out the right thing to do. First, we still need all the environment variables for host build. For docker builds we are trying to remove (effectively) ALL of the pre-established environment variables. So... removing SGX_MODE from common_config is not a solution. And its completely orthogonal to setting the variable in the makefile for docker. Second, we have an outstanding question for where to put the sgx keys and when to remove them. Who needs the enclave signing keys? When are they necessary. If they are truly ephemeral, then, as we discussed, I don't care where they are put so long as they are removed afterwards. If we need those keys, then you need to think more about where they are put and when they are put there. And we are NOT constrained by how we currently build the enclave (i.e. its OK to change the build flow to make this work "right"). To this end, I would really like to see a "document" that covers the process more thoroughly. |
372c592
to
5883065
Compare
I revised the PR.
|
5883065
to
1f230a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After figuring out that i must define PDO_ENCLAVE_CODE_SIGN_PEM
and PDO_SGX_KEY_ROOT
for both hw and sim test (which previously worked properly config-less), "unforgetting" that docker (stupidely) does not take proxy environment variables but has it's own setting in ~/.docker/settings.json
) and also realizing that EPID works only on icelake client (i.e., ICL) but not icelake server (i.e., ICX), contrary to what i was told, i could successfully run SIM & HW mode on a KBL NUC outside of FW and SIM mode on an ICX machine behind FW for SIM. Find my main concerns as file annotations.
docker/Makefile
Outdated
@@ -112,6 +121,12 @@ stop_client : | |||
# performance requirements are relatively low. | |||
# ----------------------------------------------------------------- | |||
repository : | |||
# if an enclave signing key is available on the host, copy that under build/keys in the repo | |||
# Note: the docker build (see PDO_ENCLAVE_CODE_SIGN_PEM in environment.sh) expects the key there | |||
[ ! -e ${PDO_ENCLAVE_CODE_SIGN_PEM} ] ||\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test also whether PDO_ENCLAVE_CODE_SIGN_PEM is actually defined at all! Note in previous version you did not have to define any env variable to build docker ....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two comments here:
- I have added some checks to fail more gracefully and see what's going on
- however, according to our documentation,
build/common-config.sh
should be sourced first when developing in PDO. Hence,
a. PDO_ENCLAVE_CODE_SIGN_PEM is always defined (either by the user, or by the script)
b. it would be more effective to check whether the script was sourced -- I'm open to suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2. however, according to our documentation,
build/common-config.sh
should be sourced first when developing in PDO. Hence,
This is not really required for docker (and from the docker docu also not obvious). That said, couldn't we just source that file in the docker makefile after having set SGX_MODE to get exactly what we want: defaults for somebody who runs docker and hasn't defined anything and the desired specific values otherwise?
2. a. PDO_ENCLAVE_CODE_SIGN_PEM is always defined (either by the user, or by the script)
b. it would be more effective to check whether the script was sourced -- I'm open to suggestions
Given that we document that if not defined we always create one, your current else from if [ ! -z "${PDO_ENCLAVE_CODE_SIGN_PEM}" ]; then ...
seems the right way. However, iff it is defined, why do we insist that it has to be in ${PDO_SGX_KEY_ROOT} (outside of docker) and don't just copy it to xfer directly? If we really want to have separate definition for PDO_SGX_KEY_ROOT and PDO_ENCLAVE_CODE_SIGN_PEM i don't think we should "merge" them ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we insist that it has to be in ${PDO_SGX_KEY_ROOT} (outside of docker) and don't just copy it to xfer directly?
(Good question!)
From previous discussions with Mic, we noticed two things:
- the enclave code signing key right now is only needed at build time, and if one is available we should pass it into the build
- the build happens while the docker image is created, and we cannot mount the xfer folder at that time.
For these reasons, we decided to make ${PDO_SGX_KEY_ROOT} (set to its default value, and thus within the repo!) as the destination folder for the key. In this way, the key copied/cloned together with the repo in the docker container. Then, at build time within the container, the common-config is sourced (setting the default values), and the build will find the key in the right place.
Since the built image is meant to be distributed to users, in the future we will have to remove the key material. We can accomplish that by removing the cloned repo (which includes the key).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, i see. But might be worth mentioning above as a comment in that code fragment as not so obvious?
docker/README.md
Outdated
Inside the `pdo_services` images, the `SGX_MODE` environment variable | ||
can help distinguish the build type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As i think also Mic has suggested i think we should "type" the images names, not doing so is asking for trouble; in particular given that contrary to the previous approach of relying on external variable definitino of SGX_MODE (which i preferred) we have now multiple targets which "conflict" as far as sgx-mode is concerned which increases opportunities for mix-ups ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was previous discussion here https://github.com/hyperledger-labs/private-data-objects/pull/477/files#r1527009436.
My earlier attempt unfortunately was not promising because we have SGX_MODE in multiple images and the images are interdependent. Hence, it's difficult to do the typing without modifying makefile and yaml and docker files.
However, it is possible that SGX_MODE is only necessary in the pdo-services image. So it could be removed from the other images, and and then we simply add a new type for pdo-services in hw mode.
Conclusion:
- the BKM here is to always start with a fresh clean development environment
- mix-n-match can be a concern
- solving/mitigating this requires further work -- IMO this is orthogonal to this PR, and I'm ok to do it separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, it is possible that SGX_MODE is only necessary in the pdo-services image.
Previously that was the case. However, somebody mentioned that the sdk (or, more likely, sgxssl) would be different based on mode (which on further reflection seems though rather surprising and strange).
In any case, base service is mode-agnostic, then we could (should) just remove the ENV in the corresponding dockerfile so the status is not persisted
So it could be removed from the other images, and and then we simply add a new type for pdo-services in hw mode.
That was my thinking: we just type pdo_services with the mode. Initially i thought that might also apply to ccf if we would run that in HW mode but according to Mic CCF always builds both version and so only the runtime SGX_MODE variable matters.
- solving/mitigating this requires further work -- IMO this is orthogonal to this PR, and I'm ok to do it separately.
Why is that orthogonal? Given that you have already sgx_build and sgx_run targets, seems also easy to type that image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is that orthogonal? Given that you have already sgx_build and sgx_run targets, seems also easy to type that image?
Because I was working on it here and it was too complicated to solve it in this PR.
Also, I have not explored the simplifications that we have discussed above -- if those work, the modifications are way easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave it a try and the modifications were fairly easy -- I have added a commit to type the sgx image.
docker/README.md
Outdated
This updated command allows to trigger the registration step right before | ||
starting the services. The policy registration must happen before enclaves are | ||
registered (or any enclave registration will fail). | ||
|
||
Finally, the _same_ SGX collateral must be made available to all service containers. | ||
At enclave registration time, this will allow the eservice to generate the right | ||
quote (and attestation verification report) that meets the attestation policy | ||
originally registered with the PDO Transaction Processor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should say here also where this information is found and also should mention (and use!) the defaults of build/keys/sgx_mode_{sim|hw)
. Besides not supporting any default and requiring PDO_SGX_KEY_ROOT
to be defined when running make docker, having to define it explicitly is also conceptually inconsistent as this variable is (at least as far as our existing default directories goes) dependent on SGX_MODE
. So we are using the worst of the combination: mode is now managed by targets yet we still need mode-dependent external env variables, not allowing for a no-config version of make -C docker
(as previously was possible; admittedly previously only for SGX_MODE=sim but now as we have different target we could retain that also for SGX_MODE=hw)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should say here also where this information is found and also should mention (and use!) the defaults of build/keys/sgx_mode_{sim|hw).
Correct, and that is described under docs/install.md
.
Besides not supporting any default and requiring PDO_SGX_KEY_ROOT to be defined when running make docker, having to define it explicitly is also conceptually inconsistent as this variable is (at least as far as our existing default directories goes) dependent on SGX_MODE.
Let me refer again to the documentation in docs/host_install.md
, which indicates to source build/common-config.sh
first. If PDO_SGX_KEY_ROOT
is not defined, this steps creates a default value for it as you mentioned.
So we are using the worst of the combination: mode is now managed by targets yet we still need mode-dependent external env variables, not allowing for a no-config version of make -C docker
This is what was discussed -- namely to have specific SGX targets (which clearly manage the mode).
Again, the "external" variables are not necessary if build/common-config.sh
is sourced first -- as it creates the default values.
I agree that the combination can be a problem when switching on the fly between SIM and HW mode development.
For this reason the BKM is to always start with fresh clean environment.
I'm open to suggestions to simplify this for developers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
. I'm open to suggestions to simplify this for developers.
Well, isn't typing in the image names exactly a simple way to simplify it & make it more robust?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably is -- but image typing brings us back to the other thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a commit to type the service image built for sgx.
eservice/bin/register-with-ledger.sh
Outdated
@@ -71,9 +71,11 @@ function Store { | |||
--spid ${SPID} \ | |||
--save ${eservice_enclave_info_file} \ | |||
--loglevel warn \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make this dependent on PDO_DEBUG_LEVEL
to make trouble-shooting easier (that said, the recent addition below of log-file output to screen certainly also helps in debugging). That said, i currently have that change also in the (updated) PR #480, so could be done there ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point -- ok with your update, I'll leave it as is here.
496a400
to
45b2fac
Compare
Thanks @g2flyer for reviewing and testing this. Regarding the enclave signing key, if one is defined, it will be used and Regarding the SGX collateral, the makefile will look into |
cp ${XFER_DIR}/services/keys/sgx/* ${PDO_SGX_KEY_ROOT} | ||
# refresh the environment variables (necessary for SGX-related ones) | ||
source /project/pdo/tools/environment.sh | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, why are we duplicating this? we can do better with the logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No duplication anymore.
905f954
to
ab3608d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With your changes from yesterday night i'm quite happy from my side: typed images seemed to work nicely and in either hw or sim case i don't have to specify any environment variables anymore with docker picking up the defaults and docker test works fine.
dd2a5a5
to
2977dab
Compare
A new docker build target is added to build the services to run in SGX. The resulting image is clearly typed/tagged as pdo_service_sgx to distinguish it from non-SGX builds. Also, base images are now SGX-MODE agnostic. 3 new scripts are added for handling SGX keys in docker builds. First, IAS certificates are retrieved by the host (not by the docker container) and transferred to the container through the "repository" before the build starts. Second, the enclave signing key is similarly checked for on (or directly passed by) the host, and transferred to the container through the "repository" before the build starts. Third, at run time, the necessary SGX keys are transferred from the host to the container through the xfer folder. The testing support is updated to pass the required SGX volumes and devices. Also, current small hacks to work behind a proxy are extended to support also the SGX-based image behind a proxy. The environment variables related to SPID, SPID_API_KEY and proxy are removed. All the material is passed to the build or at runtime through files, and proxy configuration happens through docker config settings (as usual) -- without forcing specific configurations, except for the small hack above. As a result the eservice configuration also gets simplified by reducing and merging the enclave.toml with the eservice.toml. Additional clean up will follow in later icommits. The documentation is updated to provide details on using and testing with SGX-based builds. Signed-off-by: Bruno Vavala <[email protected]>
Remove the expand-config script as it's unused. Remove verify-pre-conf as it is now obsolete. Update common config to remove unused/unecessary variables. Remove the enclave.toml files and other obsolete toml. Update the documentation to remove unused variables. Signed-off-by: Bruno Vavala <[email protected]>
The enclave only performs length check on the SPID, but it does not check the hex format. Previously this was performed on shell variables. This makes the enclave check more robust. Finally, additional naming updates are pushed to align with current conventions. Signed-off-by: Bruno Vavala <[email protected]>
57ecc18
to
22f4ec9
Compare
Great work, @bvavala |
This PR continues the work in #463 to enable tests in SGX HW-mode, and makes the following contributions.