diff --git a/Documentation/devel/encfiles.rst b/Documentation/devel/encfiles.rst new file mode 100644 index 0000000000..b90753c285 --- /dev/null +++ b/Documentation/devel/encfiles.rst @@ -0,0 +1,525 @@ +Encrypted Files in Gramine +========================== + +.. note :: + This is a highly technical document intended for cryptography practitioners. + + This is a living document. The last major update happened in **October 2024** + and closely corresponds to Gramine v1.8. + +With Gramine's :ref:`encrypted-files` feature, Gramine transparently encrypts +and decrypts files when the application writes and reads the files, +respectively. To use the encrypted files feature, the user must place integrity- +and confidentiality-sensitive files (or whole directories) under the +``encrypted`` FS mount in the Gramine manifest. New files created in the +``encrypted`` FS mount are automatically treated as encrypted. The format used +for encrypted and integrity-protected files is borrowed from the "protected +files" feature of Intel SGX SDK (see the corresponding section in `Intel SGX +Developer Reference manual +`__). + +Each encrypted file is encrypted separately, i.e., Gramine employs file-level +encryption and not block-level encryption. Each ``encrypted`` FS mount has a +separate encryption key (more precisely, this is the key-derivation key or KDK). +More information on the usage of encrypted files can be found in the +:ref:`encrypted-files` manifest syntax. + +The feature was previously called "protected files" or "protected FS", same as +in Intel SGX SDK. These legacy names may still be found in Gramine codebase. +A short introduction to the "protected files" feature as implemented in Intel +SGX SDK was also published in `this old blog post +`__. + +Encrypted files are primarily used with TEEs (Trusted Execution Environments) +like Intel SGX, i.e., with :program:`gramine-sgx`. However, for debugging +purposes, encrypted files are also functional in :program:`gramine-direct`. + +Security properties of encrypted files +-------------------------------------- + +The current implementation of encrypted files in Gramine provides the following +security properties: + +- **Confidentiality of user data**: all user data is encrypted and then written + to untrusted host storage; this prevents user data leakage. +- **Integrity of user data**: all user data is read from disk and decrypted, + with the Message Authentication Code (MAC) verified to detect any data + tampering. +- **Matching of file name**: when opening an existing file, the metadata of the + to-be-opened file is checked to ensure that the name of the file when created + is the same as the name given to the open operation. + +The current implementation does *not* protect against the following attacks: + +- **Rollback/replay attacks after file close**. The user cannot detect whether + they have opened an old (but authenticated) version of a file. In other words, + Gramine does not guarantee the freshness of user data in the file after this + file was closed. Note that while the file is opened, the rollback/replay + attack is prevented (by always keeping the root hash of a Merkle tree over the + file in trusted enclave memory and checking the consistency during accesses, + see more details below). +- **Side-channel attacks**. Some file metadata is not confidentiality-protected, + such as file name, file size, access time, access patterns (e.g., which blocks + are read/written). This lack of confidentiality protection of metadata could + be used by attackers to gain sensitive information. See also + https://gramine.readthedocs.io/en/stable/devel/features.html#file-systems for + additional discussions on file metadata. + +.. note :: + There is an effort to add rollback/replay attack protection in Gramine. See + the discussion in https://github.com/gramineproject/gramine/issues/1835. + +Encrypted Files subsystem in Gramine codebase +--------------------------------------------- + +Encrypted Files logic is implemented as a separate subsystem, only loosely +coupled with the rest of Gramine. Integration with the rest of Gramine code is +based on: + +- A set of public functions like ``pf_open()``, ``pf_read()``, etc. +- A set of callbacks like ``cb_read_f()``, ``cb_write_f()``, etc. + +There is a glue code that serves as a bridge between the Gramine-agnostic +Encrypted Files subsystem and the rest of Gramine. This glue code calls public +functions for high-level operations on encrypted files and registers callbacks +for low-level interactions with the host. E.g., the glue code calls +``pf_read()`` whenever the user application wants to read from the encrypted +file, then the encrypted-files logic performs crypto operations on +encrypted-file's chunks and periodically calls ``cb_read_f()`` to consume the +chunks from the host's storage. + +From the Gramine codebase perspective, the split is as follows: + +- Generic Encrypted Files code -- :file:`common/src/protected_files/` +- Glue code -- :file:`libos/src/fs/libos_fs_encrypted.c` +- Gramine FS code -- :file:`libos/src/fs/chroot/encrypted.c` + +There are several reasons for this decoupling: + +- Historical reason -- to ease the porting effort from Intel SGX SDK. +- Sharing of bug fixes -- if a bug is fixed in Intel SGX SDK, it can be easily + applied in Gramine, and vice versa. +- Reusability -- the encrypted-files code can be used as-is in stand-alone tools + like :program:`gramine-sgx-pf-crypt`. +- Crypto reviews -- the encrypted-files code is the only place that directly + uses crypto algorithms, which facilitates crypto/security review efforts. + +The application code is *not* aware of encrypted files. Applications treat +encrypted files just like regular files, e.g., apps open file descriptors (FDs), +duplicate them, perform I/O operations on files and then close the FDs. Gramine +intercepts such system calls, creates handles for FDs, consults the manifest +file to learn that these handles are encrypted-files' handles, attaches inodes +to them, and transforms regular I/O operations into encrypted-I/O operations. +Note that before working with a particular encrypted file, the encryption key of +its corresponding FS mount must be already provisioned. + +If Gramine detects tampering or integrity inconsistencies on an encrypted file, +Grmaine marks the file as corrupted and refuses any operations on this file. In +particular, the application's operations on the file will return ``-EACCES``. + +.. image:: ../img/encfiles/01_encfiles_datastructs.svg + :target: ../img/encfiles/01_encfiles_datastructs.svg + :alt: Figure: Relations between the app, the Gramine FS code, the Gramine glue code and the generic encrypted-files code. + +The diagram above shows the relations between the application, the Gramine FS +code, the Gramine glue code and the generic encrypted-files code. Here, the +``libos_encrypted_file`` data structure is hosted in the glue code, and the +``pf_context`` data structure is hosted in the generic encrypted-files code. +In addition to the standard file-system callbacks and PAL interfaces in Gramine, +the encrypted FS requires one additional configuration option for the +key-derivation key (KDK). The KDK is installed through Gramine interfaces into +the ``libos_encrypted_key`` field in the glue code which copies it into the +``kdk`` field in encrypted-files code. Also, the glue code opens a host file via +Gramine's PAL interfaces and saves the reference to it into ``pal_handle``, +which is copied into ``host_file_handle`` in encrypted-files code. With these +two fields, plus the set of registered callbacks, the encrypted-files code has +enough information to encrypt and decrypt files stored on the host's disk. + +Crypto used for encrypted files +------------------------------- + +- The current implementation of encrypted files uses AES-GCM with 128-bit key + size for encryption and MAC generation. Thus, all encryption keys are 16B in + size and all MACs are 16B in size. + +- Sub-keys are derived from the user-supplied KDK using the `NIST SP800-108 + `__ construction, with + the required PRF (Pseudorandom Function) instantiated by AES-128-CMAC. + +- Keys for node encryption (i.e., keys stored in the Merkle Hash Tree nodes aka + MHT nodes) are generated randomly using a cryptographically secure + pseudo-random number generator (CSPRNG); for x86-64, this CSPRNG is the RDRAND + instruction. + +- Initialization vectors (IVs) are always all-zeros. This is allowed because + each node-encryption key is generated randomly and is never re-used. + +- Additional authenticated data (AAD) is not used. + +- The crypto library used is mbedTLS, frequently updated by Gramine maintainers + to be the latest released version. + +Representation on host storage and in SGX enclave memory +-------------------------------------------------------- + +Encrypted files use a special format developed specifically for Intel SGX +usages. In the following, we distinguish between the representation of encrypted +files on host storage (untrusted) and the representation inside the SGX enclave +(trusted). + +An encrypted file is stored on the untrusted host storage in a file with the +same pathname, but augmented with additional metadata and split into 4KB chunks +(pages). Each chunk is also referred to as a "node". + +.. image:: ../img/encfiles/02_encfiles_representation.svg + :target: ../img/encfiles/02_encfiles_representation.svg + :alt: Figure: Representation of an encrypted file on host storage and inside the SGX enclave. + +An encrypted file is represented inside the SGX enclave as a set of interlinked +data structures and buffers. There is a main data struct ``pf_context`` for each +encrypted file. It contains an opaque reference to the host-file handle +``host_file_handle``, the initial encryption key ``kdk`` (Key Derivation Key), +the mode in which file is opened ``mode``, and references to three other +important structs: + +- ``metadata_node`` points to a bounce buffer that syncs the metadata node + between the SGX enclave and the host storage, +- ``metadata_decrypted`` points to a data struct that contains the decrypted + part of the metadata node's encrypted header, +- ``root_mht_node`` points to a data struct that represents the root MHT (Merkle + Hash Tree) node. + +Note that bounce buffers are used to prevent TOCTOU (Time of Check to Time of +Use) attacks and to prevent potential leakage of partially encrypted/decrypted +file contents. + +Encrypted files on host storage are represented as a string of 4KB chunks. Each +encrypted file starts with a *metadata node*, that has the following three +parts: + +1. The plaintext header, occupying bytes 0-57. The header contains a magic + string, a major version of the encrypted-files protocol, a minor version, a + nonce for KDF (Key Derivation Function, explained in the next section) and a + MAC (cryptographic hash over the encrypted header). +2. The encrypted header, occupying bytes 58-3941. This header has two parts: the + encrypted metadata fields and the first 3KB of actual file contents. The + metadata fields contain a file path (to prevent rename attacks), the file + size (to hide the exact file size from attackers) and the encryption key and + MAC of the root MHT node (explained later). +3. The constant padding, occupying bytes 3942-4095. This padding is added purely + to align the metadata node on the 4KB boundary and contains zeros. + +Note that if the original file is less than 3KB in size, then this file's +representation on the host constitutes only a single metadata node (in +particular, there is *no* root MHT node in this case). We will see below the +exact read/write flows for this special case. + +After the metadata node, the two node types interleave: the *MHT nodes* and the +*Data nodes*. The data nodes simply contain 4KB of ciphertext corresponding to +the 4KB of plaintext file contents. The MHT nodes serve as building blocks for a +variant of a Merkle Hash Tree. + +Each MHT node in the Merkle Hash Tree is comprised of 128 encryption key + MAC +pairs for attached Data and MHT nodes. In particular, one MHT node has 96 pairs +for the Data nodes attached to it, and 32 pairs for the child MHT nodes. Since +each key is 16B in size and each MAC is 16B in size, 128 pairs is the maximum +that can be stored in a 4KB node. + +Inside the SGX enclave, each MHT node is represented as a data struct with the +``type`` being ``MHT_NODE`` and two linked buffers: the bounce buffer that +contains the encrypted 4KB copied from the host disk and yet another data +struct that contains the decrypted MHT node's contents (the array with 128 key + +MAC pairs). Additionally, each MHT node has a ``logical_node`` number and a +``physical_node`` number. The former is the serial number in a logical +representation of the MHT nodes in the Merkle tree, whereas the latter is the +number of the page (chunk) in the on-storage representation. The difference +between logical and physical numbers is clear on the below diagram. + +Note that there is a special MHT node -- the root MHT node. It has the same +representation inside the SGX enclave and on host storage as all other MHT +nodes, but it is directly linked from the main data struct ``pf_context`` via +the ``root_mht_node`` field. Also, the root MHT node's encryption key and MAC +are stored directly in the encrypted header of the metadata node. The root MHT +node starts to be used when the plaintext file size exceeds 3KB. + +Note that the root MHT node is kept in trusted enclave memory for the lifetime +of the file handle (i.e., as long as the file is opened). This is in contrast to +other MHT nodes which can be evicted from enclave memory; see the notes on LRU +cache in :ref:`additional-details`. The fact that the root MHT node is +non-evictable ensures protection against rollback/replay attacks. + +.. image:: ../img/encfiles/03_encfiles_layout.svg + :target: ../img/encfiles/03_encfiles_layout.svg + :alt: Figure: Merkle Hash Tree of an encrypted file and file layout on host storage. + +The diagram above shows the in-enclave-memory structure of the nodes that +constitute a single encrypted file, as well as the on-disk data layout of the +same file. This diagram visualizes the difference between logical and physical +node numbers: the former are used to calculate the offsets in plaintext file, +whereas the latter are used to calculate the offsets in encrypted file. Knowing +the offset in the plaintext file, it is easy to calculate the logical node +number; knowing the logical node number, it is easy to calculate the physical +node number; finally, knowing the physical node number, it is trivial to +calculate the offset in a file on the host storage. + +Here is a C code snippet of how the calculation is done:: + + #define PF_NODE_SIZE 4096 + #define MD_USER_DATA_SIZE 3072 + #define ATTACHED_DATA_NODES_COUNT 96 + #define CHILD_MHT_NODES_COUNT 32 + + logical_data_node_number = (plaintext_file_offset - MD_USER_DATA_SIZE) / PF_NODE_SIZE; + logical_mht_node_number = logical_data_node_number / ATTACHED_DATA_NODES_COUNT; + + physical_data_node_number = logical_data_node_number + + 1 // metadata node + + 1 // MHT root node + + logical_mht_node_number; // MHT nodes in-between + + physical_mht_node_number = _physical_data_node_number + - logical_data_node_number % ATTACHED_DATA_NODES_COUNT + - 1; + + encrypted_file_offset = physical_data_node_number * PF_NODE_SIZE + +Encrypted I/O: case of file size less than 3KB +---------------------------------------------- + +Below are the flows for a special case of encrypted-file I/O, for files with +sizes less than 3KB. Such files are represented on the host using a single +metadata node. + +.. image:: ../img/encfiles/04_encfiles_write_less3k.svg + :target: ../img/encfiles/04_encfiles_write_less3k.svg + :alt: Figure: Write flow for an encrypted file with size less than 3KB. + +Assume an encrypted file created by the application. The file is first +represented solely in SGX enclave memory and is saved to untrusted host storage +on a write (or more typically, on an explicit flush operation). + +Upon file creation, Gramine sets up three data structures representing the file: +the main ``pf_context`` struct that has the reference to the corresponding host +file and the user-supplied KDK, the ``metadata_node`` bounce buffer that will be +copied out to host storage and the ``metadata_decrypted`` struct that has the +file name, the file size and a 3KB buffer to hold file contents. + +In step 1, the application writes less than 3KB of data into the file. This data +is copied from the user buffer into the ``file_data`` buffer. This ``write()`` +system call triggers the flow of encrypting the file and saving it to disk. + +To encrypt the file, Gramine needs to generate a new key. To this end, a KDF +nonce is randomly generated in step 2. Then in step 3, a `NIST SP800-108 +`__ KDF (Key Derivation +Function) is used to derive an AES-128 sub-key, with input materials being the +KDK, the hard-coded label ``SGX-PROTECTED-FS-METADATA-KEY`` and the nonce. + +Now that a new key was derived, the file can be encrypted. Step 4 shows that the +AES-GCM encryption happens in the ``metadata_node`` bounce buffer, on the +plaintext data struct ``metadata_decrypted`` and with the newly derived key. + +Finally in step 5, the resulting ciphertext is copied out from the bounce buffer +to the host storage. An additional plaintext header in bytes 0-57 is prepended +to the ciphertext, and the padding in bytes 3942-4095 aligns the resulting +metadata node to 4KB. Note that the plaintext header contains the KDF nonce +generated in step 2 and the MAC generated as a by-product of AES-GCM encryption +in step 4. The nonce and the MAC can be stored in plaintext, and they will be +used later to decrypt the metadata node's ciphertext. + +.. image:: ../img/encfiles/05_encfiles_read_less3k.svg + :target: ../img/encfiles/05_encfiles_read_less3k.svg + :alt: Figure: Read flow for an encrypted file with size less than 3KB. + +Now assume that an encrypted file previously created by the application must be +read by another application. The application opens a file with the ``open()`` +system call which instructs Gramine to set up the same three data structures +representing the file as for the write flow. Note that the KDK must have been +already supplied by the user application, and must be the same as was used for +file write. + +Then the app wants to read the file data. This triggers the read flow depicted +in the diagram above. The encrypted file is represented on the untrusted storage +as a single 4KB metadata node, which consists of a plaintext header, an +encrypted part, and an unused padding. + +In step 1, the metadata node is copied into the enclave's bounce buffer +``metadata_node``. The actual file contents are stored in ``file_data`` which is +located in the encrypted-header part of the metadata node. Thus, Gramine must +decrypt the encrypted header. To obtain the same key that was used for +encryption, a KDF nonce is read from the plaintext header in ``metadata_node`` +(step 2). Then in step 3, AES-CMAC is used for key derivation, with input +materials being the KDK and the nonce. + +Now that the key is derived, the metadata's encrypted header can be decrypted. +Step 4 shows that the AES-GCM decryption happens on the ``metadata_node`` bounce +buffer, with plaintext output moved into the data struct ``metadata_decrypted``. +As part of the decryption operation, the resulting MAC is compared against the +one read from the plaintext header in ``metadata_node``. If comparison fails, +then Gramine stops operations on this encrypted file and considers it corrupted; +an ``-EACCES`` error is returned to the application. + +Finally in step 5, the resulting ``file_data`` plaintext is copied to the +application buffer. The ``read()`` operation is finished. + +Note that in the special case of files of size less than 3KB, only the metadata +node is used. No MHT nodes and no data nodes are stored on the host. Also, the +``root_mht_node_key`` and ``root_mht_node_mac`` fields are unused in the +metadata node's encrypted header. + +Encrypted I/O: general case +--------------------------- + +Below are the flows for the general case of encrypted-file I/O, i.e., for files +with sizes greater than 3KB. + +.. image:: ../img/encfiles/06_encfiles_write_greater3k.svg + :target: ../img/encfiles/06_encfiles_write_greater3k.svg + :alt: Figure: Write flow for an encrypted file with size greater than 3KB. + +Assume an encrypted file created by the application. The application writes more +than 3KB of data into this file. + +The write flow contains similar steps to the flow described for files of less +than 3KB size above. We will only briefly outline the logic. + +The first 3KB of user-supplied data are copied into the ``file_data`` buffer of +the metadata node (step 1). The next 4KB of user-supplied data must be copied in +a data node. When Gramine notices that a new data node is required, it creates +the data node representation in enclave memory, consisting of the main data-node +struct, the ``decrypted`` 4KB buffer and the ``encrypted`` 4KB bounce buffer +(step 2). The file data are copied into the ``decrypted`` buffer. + +Since we have at least one data node, we must have a corresponding MHT node to +which this data node will be attached. Thus Gramine activates the root MHT node +representation in enclave memory, consisting of the main MHT-node struct, the +``decrypted`` 4KB array and the ``encrypted`` 4KB bounce buffer. Note that there +is no need to link the data node and the root MHT node explicitly -- a +correspondence between these nodes can be established via calculations on +logical and physical numbers of the nodes (see the C code snippet above). + +Now to encrypt the 4KB of file contents stored in the data node's ``decrypted`` +buffer, Gramine needs to generate a new key. The key is simply a 128-bit random +number (step 3). This key is stored in a corresponding slot of the root MHT +node. Since the MHT node's contents will also be encrypted, the key will not be +leaked. + +Now that a new key for the data node was generated, the data node can be +encrypted. Step 4 shows that the AES-GCM encryption happens in the ``encrypted`` +bounce buffer of the data node, on the plaintext data-node buffer ``decrypted`` +and with the newly generated key. As part of this encryption operation, the MAC +is generated and is stored in the corresponding slot of the root MHT node (thus +shaping a key + MAC pair for data node 1). + +At this point, the 4KB of the file data are stored as ciphertext in the bounce +buffer of the data node and are ready to be flushed to storage. However, the +root MHT node must also be encrypted and flushed. + +The root MHT node is already updated with the data node's key and MAC (more +specifically, only slot 1 of the MHT node's ``decrypted`` array was updated, the +rest slots contain all-zeros). So it's only a matter of encrypting the root MHT +node. For this, a new random key is generated (step 5). This key is stored in +the ``root_mht_node_key`` field of the metadata node's header. Since the header +will be encrypted, the key will not be leaked. + +Now that a key for the root MHT node was generated, the root MHT node can be +encrypted. Step 6 shows that the AES-GCM encryption happens in the ``encrypted`` +bounce buffer of the root MHT node, on the plaintext root-MHT-node ``decrypted`` +and with the newly generated key. As part of this encryption operation, the MAC +is generated and is stored in the ``root_mht_node_mac`` field of the metadata +node's header. + +At this point, both the data node and the root MHT node are ready to be flushed +to storage. Now steps 7-9 are performed, which correspond to steps 2-4 in the +write flow of the <3KB file. + +Finally, all three nodes are encrypted and are ready to be flushed: the metadata +node (contains the nonce to decrypt itself and the key + MAC to decrypt the root +MHT node), the root MHT node (contains the key + MAC to decrypt the data node) +and the data node (contains the file contents). Step 10 can be performed, that +copies out all three bounce buffers to the host's hard disk. + +The above description works for a case of a file with at most 7KB of data (3KB +stored in metadata header and 4KB stored in the data node). The diagram below +shows a generalized flow for files of arbitrary sizes; the step numbers in the +diagram correspond to the steps in the above description. + +.. image:: ../img/encfiles/07_encfiles_write_greater3k_general.svg + :target: ../img/encfiles/07_encfiles_write_greater3k_general.svg + :alt: Figure: Generic write flow for an encrypted file with size greater than 3KB. + +Now assume that an encrypted file previously created by the application must be +read by another application. The file size is greater than 3KB in size. + +.. image:: ../img/encfiles/08_encfiles_read_greater3k.svg + :target: ../img/encfiles/08_encfiles_read_greater3k.svg + :alt: Figure: Read flow for an encrypted file with size greater than 3KB. + +The read flow contains similar steps to the flow described for files of less +than 3KB size above. We will only briefly outline the logic. + +The first 3KB of file data must be copied from the ``file_data`` buffer of the +metadata node. The next 4KB of file data must be copied from the data node. When +Gramine notices that the file size exceeds 3KB, it creates the data node +representation in enclave memory, consisting of the main data-node struct, the +``decrypted`` 4KB buffer and the ``encrypted`` 4KB bounce buffer. Gramine also +activates the root MHT node representation in enclave memory. The file data will +be decrypted and then copied into the ``decrypted`` buffer. The root MHT node +will have the key and MAC for the data-node decryption. + +First the steps 1-4 are performed, which correspond to same steps 1-4 in the +read flow of the <3KB file. Then in step 5, the root MHT node is copied into the +enclave memory. The AES-GCM decryption of the root MHT node is performed using +the ``root_mht_node_key`` key and the comparison against ``root_mht_node_mac`` +(step 6). The resulting plaintext is the array of key-MAC pairs, stored in the +``decrypted`` field. Then in step 7, the data node is copied into the enclave +memory. The AES-GCM decryption of the data node is performed using the key and +MAC stored in the first slot of the root MHT node's array (step 8). + +At this point, the first 3KB of file data are stored in plaintext in the +``file_data`` buffer and the last 4KB of file data are stored in plaintext in +the ``decrypted`` buffer of the data node. The application's ``read()`` system +call can populate the user-supplied buffer with this data (steps 9 and 10). + +The above description works for a case of a file with at most 7KB of data (3KB +stored in metadata header and 4KB stored in the data node). The diagram below +shows a generalized flow for files of arbitrary sizes; the step numbers in the +diagram correspond to the steps in the above description. + +.. image:: ../img/encfiles/09_encfiles_read_greater3k_general.svg + :target: ../img/encfiles/09_encfiles_read_greater3k_general.svg + :alt: Figure: Generic read flow for an encrypted file with size greater than 3KB. + +.. _additional-details: + +Additional details +------------------ + +- Performance optimization: there is a separate LRU cache of nodes for each + opened encrypted file. This LRU cache can host up to 48 data or MHT nodes. + Note that the metadata node and the root MHT node are *not* hosted in the LRU + cache because they are never evicted (i.e., they stay in enclave memory for + the whole encrypted-file lifetime). Also note that if a data node is brought + into the cache, the whole chain of corresponding MHT nodes is also brought + into the cache. + +- There is *limited* multiprocess support for encrypted files. This means that + if the same file is accessed concurrently by two Gramine processes (and at + least one process writes to the file), the file may become corrupted or + inaccessible to one of the processes. + +- There is no support for file recovery, if the file was only partially written + to storage. Gramine will treat this file as corrupted and will return an + ``-EACCES`` error. (This is in contrast to Intel SGX SDK which supports file + recovery.) + +- There is no key rotation scheme. The application must perform key rotation of + the KDK by itself (by overwriting the ``/dev/attestation/keys/`` + pseudo-files). Some support for key rotation may appear in future releases of + Gramine. + + - It is worth pointing out that the format of encrypted files mostly uses + one-time keys. The KDK is only used to derive the metadata-node key, thus it + produces much less ciphertext than if it would be used to directly encrypt + file data. Therefore, the usual NIST limits on the total number of + invocations of the encryption operation with the same key would be reached + much slower. diff --git a/Documentation/img/encfiles/01_encfiles_datastructs.svg b/Documentation/img/encfiles/01_encfiles_datastructs.svg new file mode 100644 index 0000000000..14d248e70b --- /dev/null +++ b/Documentation/img/encfiles/01_encfiles_datastructs.svg @@ -0,0 +1 @@ +fd1=open(“file.enc”)fd2=dup(fd1)fd3=open(“file.enc”)fd4=dup(fd3)libos_handle1inodelibos_encrypted_filelibos_encrypted_key(128-bit AES-GCM root encryption key)pf_context(in-enclave metadata & buffers)pal_handle(out-of-enclave file on host)host_file_handlekdkmode = READ|WRITEmetadata_nodemetadata_decryptedroot_mht_nodelibos_handle2same keysame handleApplicationGramine LibOSGramine Encrypted FilesGlossary:KDK = Key Derivation KeyMHT = Merkle Hash Tree \ No newline at end of file diff --git a/Documentation/img/encfiles/02_encfiles_representation.svg b/Documentation/img/encfiles/02_encfiles_representation.svg new file mode 100644 index 0000000000..e00651413e --- /dev/null +++ b/Documentation/img/encfiles/02_encfiles_representation.svg @@ -0,0 +1 @@ +#0: Metadata node (first 58 bytesin plaintext)host_file_handlekdkmode = READ|WRITEmetadata_nodemetadata_decryptedroot_mht_nodeGlossary:KDK = Key Derivation KeyMHT = Merkle Hash TreeKDF = Key Derivation Function“GRAFS_PF”0x010x00KDF nonceMAC07894157encrypted metadataover583941Constant padding39424095file_path[772]file_sizeroot_mht_node_keyroot_mht_node_macfile_data[3072]type = MHT_NODElogical_node= 0physical_node= 1encrypteddecrypted#1: Root MHT node (all fields encrypted)data_key1data_mac1mht_key1mht_mac1data_key1data_mac1data_key96data_mac96mht_key1mht_mac1mht_key32mht_mac320153130724095BouncebufferBounce bufferSGX enclave (trusted)Host storage (untrusted)3071encrypted file dataArrows legend:Encrypted on storageCopied in/out of enclave869 \ No newline at end of file diff --git a/Documentation/img/encfiles/03_encfiles_layout.svg b/Documentation/img/encfiles/03_encfiles_layout.svg new file mode 100644 index 0000000000..11bf00a1ed --- /dev/null +++ b/Documentation/img/encfiles/03_encfiles_layout.svg @@ -0,0 +1 @@ +#0: Metadata#1: Root MHT0#2: Data0#97: Data95#98: MHT1#99: Data96#3106: MHT32#3202: Data3072#3203: MHT33In-memory data structure:On-disk data layout:MetadataMHT0Data0Data95MHT1Data96MHT32Data3072MHT33ABEncryption key and MACof A is stored in B#X: DataY0thpage1232033204#X physical node X(4KB page on the disk)DataYlogical node Y(separate counting for MHT and Data nodes)Notes: \ No newline at end of file diff --git a/Documentation/img/encfiles/04_encfiles_write_less3k.svg b/Documentation/img/encfiles/04_encfiles_write_less3k.svg new file mode 100644 index 0000000000..1e772be333 --- /dev/null +++ b/Documentation/img/encfiles/04_encfiles_write_less3k.svg @@ -0,0 +1 @@ +#0: Metadata node (first 58 bytes in plaintext)host_file_handlekdkmode = READ|WRITEmetadata_nodemetadata_decryptedroot_mht_nodeGlossary:KDK = Key Derivation KeyMHT = Merkle Hash TreeKDF = Key Derivation Function“GRAFS_PF”0x010x00KDF nonceMAC07894157encrypted metadataover583941Constant padding39424095file_path[772]file_sizeroot_mht_node_keyroot_mht_node_macfile_data[3072]type = MHT_NODElogical_node= 0physical_node= 1encrypteddecrypted#1: Root MHT node (all fields encrypted)data_key1data_mac1mht_key1mht_mac1data_key1data_mac1data_key96data_mac96mht_key1mht_mac1mht_key32mht_mac320153130724095BouncebufferBounce bufferSGX enclave (trusted)Host storage (untrusted)3071encrypted file dataArrows legend:Encrypted on storageCopied out of enclaveUnused in this caseCase of file size < 3KB:Encrypting new fileKDF nonce = rand()derive keyencryptcopy out869file_data← write(buf) \ No newline at end of file diff --git a/Documentation/img/encfiles/05_encfiles_read_less3k.svg b/Documentation/img/encfiles/05_encfiles_read_less3k.svg new file mode 100644 index 0000000000..9d87f8860c --- /dev/null +++ b/Documentation/img/encfiles/05_encfiles_read_less3k.svg @@ -0,0 +1 @@ +#0: Metadata node (first 58 bytes in plaintext)host_file_handlekdkmode = READ|WRITEmetadata_nodemetadata_decryptedroot_mht_nodeGlossary:KDK = Key Derivation KeyMHT = Merkle Hash TreeKDF = Key Derivation Function“GRAFS_PF”0x010x00KDF nonceMAC07894157encrypted metadataover583941Constant padding39424095file_path[772]file_sizeroot_mht_node_keyroot_mht_node_macfile_data[3072]type = MHT_NODElogical_node= 0physical_node= 1encrypteddecrypted#1: Root MHT node (all fields encrypted)data_key1data_mac1mht_key1mht_mac1data_key1data_mac1data_key96data_mac96mht_key1mht_mac1mht_key32mht_mac320153130724095BouncebufferBounce bufferSGX enclave (trusted)Host storage (untrusted)3071encrypted file dataArrows legend:Decrypted from storageCopied into enclaveUnused in this caseCase of file size < 3KB:Decrypting existing fileKDF nonce = loadedderive keydecryptcopy in (on file open)869read(buf) ← file_data \ No newline at end of file diff --git a/Documentation/img/encfiles/06_encfiles_write_greater3k.svg b/Documentation/img/encfiles/06_encfiles_write_greater3k.svg new file mode 100644 index 0000000000..dd8802f3f0 --- /dev/null +++ b/Documentation/img/encfiles/06_encfiles_write_greater3k.svg @@ -0,0 +1 @@ +#0: Metadata node (first 58 bytes in plaintext)host_file_handlekdkmode = READ|WRITEmetadata_nodemetadata_decryptedroot_mht_nodeGlossary:KDK = Key Derivation KeyMHT = Merkle Hash TreeKDF = Key Derivation Function“GRAFS_PF”0x010x00KDF nonceMAC07894157encrypted metadataover583941Constant padding39424095file_path[772]file_sizeroot_mht_node_keyroot_mht_node_macfile_data[3072]type = MHT_NODElogical_node= 0physical_node= 1encrypteddecrypted#1: Root MHT node (all fields encrypted)data_key1data_mac1data_key96data_mac96mht_key1mht_mac1mht_key32mht_mac320153130724095BouncebufferBounce bufferSGX enclave (trusted)Host storage (untrusted)3071encrypted file dataArrows legend:Encrypted on storageCopied out of enclaveUpdates MHT dataCase of file size > 3KB:Encrypting new fileKDF nonce = rand()derive keyencryptcopy out869file_data← write(buf) …type = DATA_NODElogical_node= 0physical_node= 2encrypteddecryptedBouncebuffer#2: Data node (all encrypted)new.decrypted← write(buf)key = rand()encryptkey = rand()encryptcopy outcopy out \ No newline at end of file diff --git a/Documentation/img/encfiles/07_encfiles_write_greater3k_general.svg b/Documentation/img/encfiles/07_encfiles_write_greater3k_general.svg new file mode 100644 index 0000000000..6bae355a60 --- /dev/null +++ b/Documentation/img/encfiles/07_encfiles_write_greater3k_general.svg @@ -0,0 +1 @@ +#0: Metadata#1: Root MHT0#98: MHT1#3203: MHT33KDKwrite(buf)BouncebufferkeyMACBouncebufferkeyMACBouncebufferkeyMACBouncebufferRoot MHT keyRoot MHT MACderived keyrandom nonceBouncebufferFirst 58 bytes of metadata node are stored in plaintextCase of file size > 3KB:Encrypting new fileRest of metadata node and MHT/data nodes encryptedMetadata MAC \ No newline at end of file diff --git a/Documentation/img/encfiles/08_encfiles_read_greater3k.svg b/Documentation/img/encfiles/08_encfiles_read_greater3k.svg new file mode 100644 index 0000000000..696f2aeb68 --- /dev/null +++ b/Documentation/img/encfiles/08_encfiles_read_greater3k.svg @@ -0,0 +1 @@ +#0: Metadata node (first 58 bytes in plaintext)host_file_handlekdkmode = READ|WRITEmetadata_nodemetadata_decryptedroot_mht_nodeGlossary:KDK = Key Derivation KeyMHT = Merkle Hash TreeKDF = Key Derivation Function“GRAFS_PF”0x010x00KDF nonceMAC07894157encrypted metadataover583941Constant padding39424095file_path[772]file_sizeroot_mht_node_keyroot_mht_node_macfile_data[3072]type = MHT_NODElogical_node= 0physical_node= 1encrypteddecrypted#1: Root MHT node (all fields encrypted)data_key1data_mac1data_key96data_mac96mht_key1mht_mac1mht_key32mht_mac320153130724095BouncebufferBounce bufferSGX enclave (trusted)Host storage (untrusted)3071encrypted file dataArrows legend:Decrypted from storageCopied into enclaveCase of file size > 3KB:Decrypting existing fileKDF nonce = loadedderive keydecrypt869type = DATA_NODElogical_node= 0physical_node= 2encrypteddecryptedBouncebuffer#2: Data node (all encrypted)decryptdecryptcopy incopy incopy in (on file open)… read(buf) ← node.decryptedread(buf) ← file_data \ No newline at end of file diff --git a/Documentation/img/encfiles/09_encfiles_read_greater3k_general.svg b/Documentation/img/encfiles/09_encfiles_read_greater3k_general.svg new file mode 100644 index 0000000000..624958edfc --- /dev/null +++ b/Documentation/img/encfiles/09_encfiles_read_greater3k_general.svg @@ -0,0 +1 @@ +#0: Metadata#1: Root MHT0#98: MHT1#3203: MHT33KDKread(buf)BouncebufferkeyMACBouncebufferkeyMACBouncebufferkeyMACBouncebufferRoot MHT keyRoot MHT MACderived keyloaded nonceBouncebufferCase of file size > 3KB:Decrypting existing fileMetadata MACFirst 58 bytes of metadata node are stored in plaintextRest of metadata node and MHT/data nodes encrypted \ No newline at end of file diff --git a/Documentation/index.rst b/Documentation/index.rst index 7c977d8e8d..64df0846c1 100644 --- a/Documentation/index.rst +++ b/Documentation/index.rst @@ -136,9 +136,11 @@ The resources page contains a description of features of Gramine, a list of maintainers, a list of users of Gramine, introduction to the Intel SGX technology and a glossary to help you with any questions you may have. -- :doc:`devel/features` -- This page has a comprehensive description of - implemented and unimplemented features of Gramine, including the lists of - available system calls and pseudo-files. +- :doc:`devel/features` -- This page describes the implemented and unimplemented + features of Gramine, including the lists of available system calls and + pseudo-files. +- :doc:`devel/encfiles` -- This page describes the Encrypted Files feature of + Gramine. - :doc:`management-team` - This page lists maintainers of Gramine. - :doc:`gramine-users` - See what companies use Gramine for their confidential computing needs. @@ -226,6 +228,7 @@ Indices and tables :maxdepth: 1 devel/features + devel/encfiles management-team gramine-users sgx-intro