Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Key Map Implementation - Replacing TDE Forks #113

Merged
merged 5 commits into from
Feb 19, 2024
Merged

Conversation

hqakhtar
Copy link
Contributor

This PR has two commits with one commit including some refactoring, and the other commit includes implementation of the key map architecture. Unfortunately, since most of this code was written back in December, it is very hard to recreate a separate PR for the refactoring part. Therefore, for ease, I've created a separate commit for it.

Copy link
Collaborator

@dutow dutow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comments.

My main issue is: the master key name handling seems to be completely incorrect.

Generally, we have two names:

  • A generic name, master-key
  • A specific version, master-key-X, e.g. master-key-13

An entire TDE file always has to be encrypted entirely with the same key version, otherwise we won't be able to decrypt it correctly.

This means that:

  • If the TDE file doesn't exists, we have to use the generic name (master-key), lookup the latest master key version master-key-X using getLatestKey / generateKey, and then save the name of the specific key it gives us (with the version number) to the file.
  • When adding a key to an existing file, we have to lookup the specific master key name stored in the file (master-key-X), use keyringGetKey to retrieve it, and encrypt the new internal key using that specific master key
  • When reading an existing file, similarly we have to use keyringGetKey to retrieve the specific version specified in the file
  • When rotating the master key, we read the file using the existing earlier key as above, but then generate a new version using generateKey, and then rewrite the entire file using the new version.

src/encryption/enc_aes.c Outdated Show resolved Hide resolved
* short lifespan until it is written to disk.
*/
void
AesEncryptKey(const keyInfo *master_key_info, RelKeysData *rel_key_data, RelKeysData **p_enc_rel_key_data, size_t *enc_key_bytes)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function shouldn't be in this file: this is where we have code that only deals with encryption and openssl. The code that connects this to postgres codo is in enc_tuple.c

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's essentially our interface to the SSL functions. Hence, I put these two functions here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this function already uses the functions defined in enc_aes.h. The idea is that enc_aes uses OpenSSL directly, and doesn't depend on postgres/pg_tde headers - see the includes above, it had no entries related to these, before this PR.

Anything that depends on postgres/pg_tde headers is implemented in enc_tuple.h.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved.

src/encryption/enc_aes.c Outdated Show resolved Hide resolved
src/encryption/enc_aes.c Outdated Show resolved Hide resolved
src/encryption/enc_aes.c Outdated Show resolved Hide resolved
src/access/pg_tde_tdemap.c Show resolved Hide resolved
src/access/pg_tde_tdemap.c Outdated Show resolved Hide resolved
src/access/pg_tde_tdemap.c Show resolved Hide resolved
src/access/pg_tde_tdemap.c Outdated Show resolved Hide resolved
the old "tde" fork architecture.

This new architecture implements a two file pair with:
(1) Map File
(2) Key Data File

Both files contain a header that contains the name of the master key that
was to encrypt the data keys and a file version. The file version is set
to PG_TDE_FILEMAGIC at the moment and it can be used to differiate between
different file format versions in case we change the structure later on.

The map file is a list of relNumber, flags and key index.
- relNumber is the Oid of the associated relation.
- Flags define if the map entry is free or in use.
- Key index points to the starting position of the key in the key data file.

The flags play a pivotal role in avoiding the file to grow infinitely. When
a relation is either deleted or a transaction is aborted, the entry map entry
is marked as MAP_ENTRY_FREE. Any next transaction requiring to store its
relation key will pick the first entry with flag set to MAP_ENTRY_FREE.

The key data file is simply a list of keys. No flags are needed as the validity
is identified by the map file. Writing to the file is performed using FileWrite
function. This avoids any locking in the key data file.

Pending:
- Implementation of key rotation
- Locking of file during key rotation or map entry
- Review of fflush calls
- Review of the WAL
@hqakhtar hqakhtar reopened this Feb 7, 2024
src/access/pg_tde_tdemap.c Outdated Show resolved Hide resolved
src/access/pg_tde_tdemap.c Outdated Show resolved Hide resolved
src/access/pg_tde_tdemap.c Outdated Show resolved Hide resolved
src/access/pg_tde_tdemap.c Show resolved Hide resolved
src/access/pg_tde_tdemap.c Show resolved Hide resolved
src/access/pg_tde_tdemap.c Show resolved Hide resolved
src/access/pg_tde_tdemap.c Show resolved Hide resolved
* short lifespan until it is written to disk.
*/
void
AesEncryptKey(const keyInfo *master_key_info, RelKeysData *rel_key_data, RelKeysData **p_enc_rel_key_data, size_t *enc_key_bytes)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this function already uses the functions defined in enc_aes.h. The idea is that enc_aes uses OpenSSL directly, and doesn't depend on postgres/pg_tde headers - see the includes above, it had no entries related to these, before this PR.

Anything that depends on postgres/pg_tde headers is implemented in enc_tuple.h.

Hamid Akhtar added 3 commits February 14, 2024 11:56
Moving the key encryption/decryption functions to the enc_tuple file and
renaming the files according to the functionality.
and when redo-ing the log.

Also, updated the handling of master key to accomodate versioning.
src/access/pg_tde_tdemap.c Outdated Show resolved Hide resolved
* doGenerateKey should be false and doRaiseError should be set to indicate
* that master key is expected but could not be accessed.
*/
const keyInfo* getMasterKey(const char* internalName, int doGenerateKey, int doRaiseError)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think doGenerateKey, doRaiseError should be bool not int - in a sake of the interface clarity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. 🤦

rel_key_data = tde_create_rel_key(rlocator, key, master_key_info);
enc_rel_key_data = tde_encrypt_rel_key(master_key_info, rel_key_data);

/* Get the file paths */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Get the file paths */
/* Set the file paths */

/*
* Add the encyrpted key to the key map data file structure.
*/
pg_tde_write_key_map_entry(newrlocator, enc_rel_key_data, master_key_info);
Copy link
Member

@dAdAbird dAdAbird Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to add a new internal key to the memory cache as well, as there is a big chances that the key will be used (on INSERT) right after the relation creation. This is how it was before and I think it makes sense from the performance perspective

- getMasterKey function argument types to bool from int
- Before xlog redo, the decrypted key is added to the key cache.
@hqakhtar hqakhtar merged commit 56af84e into percona:main Feb 19, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants