backends/igvm: Fix memory leak #12

peterfang · 2024-05-07T10:00:52Z

Always pair igvm_get_buffer() with igvm_free_buffer() before exiting the loop in supported_platform_compat_mask().

pc_machine_kvm_type() was introduced by commit e21be72 ("i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode") to do Xen specific initialization by utilizing kvm_type method. commit eeedfe6 ("hw/xen: Simplify emulated Xen platform init") moves the Xen specific initialization to pc_basic_device_init(). There is no need to keep the PC specific kvm_type() implementation anymore. On the other hand, later patch will implement kvm_type() method for all x86/i386 machines to support KVM_X86_SW_PROTECTED_VM. Signed-off-by: Xiaoyao Li <[email protected]> Reviewed-by: Isaku Yamahata <[email protected]> Reviewed-by: David Hildenbrand <[email protected]>

Decrease array index cpuid_i when CPUID leaf 1F is skipped, otherwise it will get an all zero'ed CPUID entry with leaf 0 and subleaf 0. It conflicts with correct leaf 0. Signed-off-by: Xiaoyao Li <[email protected]>

No such constraint that subleaf index needs to be less than 64. Signed-off-by: Xiaoyao Li <[email protected]>

…ry() The upper 16 bits of kvm_userspace_memory_region::slot are address space id. Parse it separately in trace_kvm_set_user_memory(). Signed-off-by: Xiaoyao Li <[email protected]>

This patch needs to be updated by script scripts/update-linux-headers.sh once gmem fd support is upstreamed in Linux kernel. Signed-off-by: Xiaoyao Li <[email protected]>

Add KVM guest_memfd support to RAMBlock so both normal hva based memory and kvm guest memfd based private memory can be associated in one RAMBlock. Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to create private guest_memfd during RAMBlock setup. Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of confidential guests, such as TDX VM. How and when to set it for memory backends will be implemented in the following patches. Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has KVM guest_memfd allocated. Signed-off-by: Xiaoyao Li <[email protected]> --- Changes in v3: - rename gmem to guest_memfd; - close(guest_memfd) when RAMBlock is released; (Daniel P. Berrangé) - Suqash the patch that introduces memory_region_has_guest_memfd().

KVM allows KVM_GUEST_MEMFD_ALLOW_HUGEPAGE for guest memfd. When the flag is set, KVM tries to allocate memory with transparent hugeapge at first and falls back to non-hugepage on failure. However, KVM defines one restriction that size must be hugepage size aligned when KVM_GUEST_MEMFD_ALLOW_HUGEPAGE is set. Signed-off-by: Xiaoyao Li <[email protected]> --- v3: - New one in v3.

Add a new member "require_guest_memfd" to memory backends. When it's set to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm guest_memfd will be allocated during RAMBlock allocation. Memory backend's @require_guest_memfd is wired with @require_guest_memfd field of MachineState. MachineState::require_guest_memfd is supposed to be set by any VMs that requires KVM guest memfd as private memory, e.g., TDX VM. Signed-off-by: Xiaoyao Li <[email protected]>

Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM. With KVM_SET_USER_MEMORY_REGION2, QEMU can set up memory region that backend'ed both by hva-based shared memory and guest memfd based private memory. Signed-off-by: Chao Peng <[email protected]> Co-developed-by: Xiaoyao Li <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>

Introduce the helper functions to set the attributes of a range of memory to private or shared. This is necessary to notify KVM the private/shared attribute of each gpa range. KVM needs the information to decide the GPA needs to be mapped at hva-based shared memory or guest_memfd based private memory. Signed-off-by: Xiaoyao Li <[email protected]>

…card_range() Commit d3a5038 ("exec: ram_block_discard_range") introduced ram_block_discard_range() which grabs some code from ram_discard_range(). However, during code movement, it changed alignment check of host_startaddr from qemu_host_page_size to rb->page_size. When ramblock is back'ed by hugepage, it requires the startaddr to be huge page size aligned, which is a overkill. e.g., TDX's private-shared page conversion is done at 4KB granularity. Shared page is discarded when it gets converts to private and when shared page back'ed by hugepage it is going to fail on this check. So change to alignment check back to qemu_host_page_size. Signed-off-by: Xiaoyao Li <[email protected]> --- Changes in v3: - Newly added in v3;

…ge() Use __func__ to avoid hard-coded function name. Signed-off-by: Xiaoyao Li <[email protected]>

It's used for discarding opposite memory after memory conversion, for confidential guest. When page is converted from shared to private, the original shared memory can be discarded via ram_block_discard_range(); When page is converted from private to shared, the original private memory is back'ed by guest_memfd. Introduce ram_block_discard_guest_memfd_range() for discarding memory in guest_memfd. Originally-from: Isaku Yamahata <[email protected]> Codeveloped-by: Xiaoyao Li <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>

Currently only KVM_MEMORY_EXIT_FLAG_PRIVATE in flags is valid when KVM_EXIT_MEMORY_FAULT happens. It indicates userspace needs to do the memory conversion on the RAMBlock to turn the memory into desired attribute, i.e., private/shared. Note, KVM_EXIT_MEMORY_FAULT makes sense only when the RAMBlock has guest_memfd memory backend. Note, KVM_EXIT_MEMORY_FAULT returns with -EFAULT, so special handling is added. Signed-off-by: Chao Peng <[email protected]> Co-developed-by: Xiaoyao Li <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>

Signed-off-by: Isaku Yamahata <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>

Implement mc->kvm_type() for i386 machines. It provides a way for user to create SW_PROTECTE_VM. Also store the vm_type in machinestate to other code to query what the VM type is. Signed-off-by: Xiaoyao Li <[email protected]>

TDX VM requires VM type KVM_X86_TDX_VM to be passed to kvm_ioctl(KVM_CREATE_VM). If tdx-guest object is specified to confidential-guest-support, like, qemu -machine ...,confidential-guest-support=tdx0 \ -object tdx-guest,id=tdx0,... it parses VM type as KVM_X86_TDX_VM. Signed-off-by: Xiaoyao Li <[email protected]>

Introduce a separate function kvm_confidential_guest_init(), which dispatches specific confidential guest initialization function by ms->cgs type. Signed-off-by: Xiaoyao Li <[email protected]> Acked-by: Gerd Hoffmann <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]>

…ivate value Introduce new flag RAM_DEFAULT_PRIVATE for RAMBlock. It's used to indicate the default attribute, private or not. Set the RAM range to private explicitly when it's default private. Originated-from: Isaku Yamahata <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>

Introduce memory_region_init_ram_guest_memfd() to allocate private guset memfd on the MemoryRegion initialization. It's for the use case of TDVF, which must be private on TDX case. Signed-off-by: Xiaoyao Li <[email protected]>

In mch_realize(), process PAM initialization before SMRAM initialization so that later patch can skill all the SMRAM related with a single check. Signed-off-by: Isaku Yamahata <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>

Add a q35 property to check whether or not SMM ranges, e.g. SMRAM, TSEG, etc... exist for the target platform. TDX doesn't support SMM and doesn't play nice with QEMU modifying related guest memory ranges. Signed-off-by: Isaku Yamahata <[email protected]> Co-developed-by: Sean Christopherson <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>

This reverts commit a1a6b66aa83d26bab1628b92b7720d5e4b672e7e.

TODO: squash into: kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot Signed-off-by: Michael Roth <[email protected]>

Signed-off-by: Michael Roth <[email protected]>

Currently all SEV/SEV-ES functionality is managed through a single 'sev-guest' QOM type. With upcoming support for SEV-SNP, taking this same approach won't work well since some of the properties/state managed by 'sev-guest' is not applicable to SEV-SNP, which will instead rely on a new QOM type with its own set of properties/state. To prepare for this, this patch moves common state into an abstract 'sev-common' parent type to encapsulate properties/state that is common to both SEV/SEV-ES and SEV-SNP, leaving only SEV/SEV-ES-specific properties/state in the current 'sev-guest' type. This should not affect current behavior or command-line options. As part of this patch, some related changes are also made: - a static 'sev_guest' variable is currently used to keep track of the 'sev-guest' instance. SEV-SNP would similarly introduce an 'sev_snp_guest' static variable. But these instances are now available via qdev_get_machine()->cgs, so switch to using that instead and drop the static variable. - 'sev_guest' is currently used as the name for the static variable holding a pointer to the 'sev-guest' instance. Re-purpose the name as a local variable referring the 'sev-guest' instance, and use that consistently throughout the code so it can be easily distinguished from sev-common/sev-snp-guest instances. - 'sev' is generally used as the name for local variables holding a pointer to the 'sev-guest' instance. In cases where that now points to common state, use the name 'sev_common'; in cases where that now points to state specific to 'sev-guest' instance, use the name 'sev_guest' Signed-off-by: Michael Roth <[email protected]>

SEV-SNP support relies on a different set of properties/state than the existing 'sev-guest' object. This patch introduces the 'sev-snp-guest' object, which can be used to configure an SEV-SNP guest. For example, a default-configured SEV-SNP guest with no additional information passed in for use with attestation: -object sev-snp-guest,id=sev0 or a fully-specified SEV-SNP guest where all spec-defined binary blobs are passed in as base64-encoded strings: -object sev-snp-guest,id=sev0, \ policy=0x30000, \ init-flags=0, \ id-block=YWFhYWFhYWFhYWFhYWFhCg==, \ id-auth=CxHK/OKLkXGn/KpAC7Wl1FSiisWDbGTEKz..., \ auth-key-enabled=on, \ host-data=LNkCWBRC5CcdGXirbNUV1OrsR28s..., \ guest-visible-workarounds=AA==, \ See the QAPI schema updates included in this patch for more usage details. In some cases these blobs may be up to 4096 characters, but this is generally well below the default limit for linux hosts where command-line sizes are defined by the sysconf-configurable ARG_MAX value, which defaults to 2097152 characters for Ubuntu hosts, for example. Co-developed-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Michael Roth <[email protected]>

When SEV-SNP is enabled, the KVM_SNP_INIT command is used to initialize the platform. The command checks whether SNP is enabled in the KVM, if enabled then it allocates a new ASID from the SNP pool and calls the firmware to initialize the all the resources. Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Michael Roth <[email protected]>

The SNP_LAUNCH_START is called first to create a cryptographic launch context within the firmware. Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Michael Roth <[email protected]>

The KVM_SEV_SNP_LAUNCH_UPDATE command is used for encrypting the bios image used for booting the SEV-SNP guest. Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Michael Roth <[email protected]>

Current SNP guest kernels will attempt to access these regions with with C-bit set, so guest_memfd is needed to handle that. Whether guests should actually try to access ROM regions in this way (or need to deal with legacy ROM regions at all), is a separate issue to be addressed on kernel side, but current SNP kernels will exhibit this behavior and so need some sort of handling. Signed-off-by: Michael Roth <[email protected]>

Test kernel now uses hugepages by default. Signed-off-by: Michael Roth <[email protected]>

Signed-off-by: Michael Roth <[email protected]>

SEV uses these to pin pages, SNP doesn't need this since it relies on gmem for any private pages. Signed-off-by: Michael Roth <[email protected]>

The IGVM library allows Independent Guest Virtual Machine files to be parsed and processed. IGVM files are used to configure guest memory layout, initial processor state and other configuration pertaining to secure virtual machines. This adds the --enable-igvm configure option, enabled by default, which attempts to locate and link against the IGVM library via pkgconfig and sets CONFIG_IGVM if found. The library is added to the system_ss target in backends/meson.build where the IGVM parsing will be performed by the ConfidentialGuestSupport object. Signed-off-by: Roy Hopkins <[email protected]>

In order to add support for parsing IGVM files for secure virtual machines, a the path to an IGVM file needs to be specified as part of the guest configuration. It makes sense to add this to the ConfidentialGuestSupport object as this is common to all secure virtual machines that potentially could support IGVM based configuration. This patch allows the filename to be configured via the QEMU object model in preparation for subsequent patches that will read and parse the IGVM file. Signed-off-by: Roy Hopkins <[email protected]>

In preparation for supporting the processing of IGVM files to configure guests, this adds a set of functions to ConfidentialGuestSupport allowing configuration of secure virtual machines that can be implemented for each supported isolation platform type such as Intel TDX or AMD SEV-SNP. These functions will be called by IGVM processing code in subsequent patches. This commit provides a default implementation of the functions that either perform no action or generate a warning or error when they are called. Targets that support ConfidentalGuestSupport should override these implementations. Signed-off-by: Roy Hopkins <[email protected]>

This commit adds an implementation of an IGVM loader which parses the file specified as a pararameter to ConfidentialGuestSupport and provides a function that uses the interface in the same object to configure and populate guest memory based on the contents of the file. The IGVM file is parsed when a filename is provided but the code to process the IGVM file is not yet hooked into target systems. This will follow in a later commit. Signed-off-by: Roy Hopkins <[email protected]>

An IGVM file contains configuration of a guest that supports confidential computing hardware. As part of the PC system initialization, the IGVM needs to be processed to apply this configuration before the guest is started. This patch introduces processing of a provided IGVM file at the end of the current PC initialization steps. If an IGVM file has been provided then the directives in the file are processed completing the initialization of the target. If no IGVM file has been specified by the user then no there is no intended consequences in these changes. Signed-off-by: Roy Hopkins <[email protected]>

…h IGVM When using an IGVM file the configuration of the system firmware is defined by IGVM directives contained in the file. In this case the user should not configure any pflash devices. This commit skips initialization of the ROM mode when pflash0 is not set then checks to ensure no pflash devices have been configured when using IGVM, exiting with an error message if this is not the case. Signed-off-by: Roy Hopkins <[email protected]>

The segment registers should be updated via a call to cpu_x86_load_seg_cache(). However, this only works for the general purpose segment registers and not LDTR or TR. In these cases, the function call results in an out-of-bounds access on the env->segs array. Fortunately the out-of-bounds access actually overflows into the correct segment registers. This commit fixes the handling of those segment registers. Signed-off-by: Roy Hopkins <[email protected]>

When an SEV guest is started, the reset vector and state are extracted from metadata that is contained in the firmware volume. In preparation for using IGVM to setup the initial CPU state, the code has been refactored to populate vmcb_save_area for each CPU which is then applied during guest startup and CPU reset. Signed-off-by: Roy Hopkins <[email protected]>

The ConfidentialGuestSupport object defines a number of virtual functions that are called during processing of IGVM directives to query or configure initial guest state. In order to support processing of IGVM files, these functions need to be implemented by relevant isolation hardware support code such as SEV. This commit implements the required functions for SEV-ES and adds support for processing IGVM files for configuring the guest. Signed-off-by: Roy Hopkins <[email protected]>

IGVM support has been implemented for Confidential Guests that support AMD SEV and AMD SEV-ES. Add some documentation that gives some background on the IGVM format and how to use it to configure a confidential guest. Signed-off-by: Roy Hopkins <[email protected]>

When sev_get_capabilities() is called on non-SEV platforms, an error is correctly generated stating it is not supported. The function should return NULL at this point but instead continues to further check for supported features, subsequently causing a crash. This commit adds the missing NULL return. Signed-off-by: Roy Hopkins <[email protected]>

For confidential guests a policy can be provided that defines the security level, debug status, expected launch measurement and other parameters that define the configuration of the confidential platform. This commit adds a new function named set_guest_policy() that can be implemented by each confidential platform, such as AMD SEV to set the policy. This will allow configuration of the policy from a multi-platform resource such as an IGVM file without the IGVM processor requiring specific implementation details for each platform. Signed-off-by: Roy Hopkins <[email protected]>

The initialization secions in IGVM files contains configuration that should be applied to the guest platform before it is started. This includes guest policy and other information that can affect the security level and measurement of a confidential guest. This commit refactors the existing section processing code to prepare it for handling initialization sections as well as directives and iterates the initialization sections in the IGVM file. Signed-off-by: Roy Hopkins <[email protected]>

Adds a handler for the guest policy initialization IGVM section and builds an SEV policy based on this information and the ID block directive if present. The policy is applied using by calling 'set_guest_policy()' on the ConfidentialGuestSupport object. Signed-off-by: Roy Hopkins <[email protected]>

The new cgs_set_guest_policy() function is provided to receive the guest policy flags, SNP ID block and SNP ID authentication from guest configuration such as an IGVM file and apply it to the platform prior to launching the guest. The policy is used to populate values for the existing 'policy', 'id_block' and 'id_auth' parameters. When provided, the guest policy is applied and the ID block configuration is used to verify the launch measurement and signatures. The guest is only successfully started if the expected launch measurements match the actual measurements and the signatures are valid. Signed-off-by: Roy Hopkins <[email protected]>

This fixes the handling of return codes from the KVM_SEV_SNP_LAUNCH_FINISH ioctl. The ioctl returns a success code if the command was successfully sent to the SEV PSP regardless of whether the command itself was successful or not. Any error from the PSP is returned instead via the '&error' parameter. This commit checks 'error' to see if the processing resulted in an error and reports it as required. Signed-off-by: Roy Hopkins <[email protected]>

The synchronization between VMSA and CPU state did not include the GDT, IDT, LDTR or TR registers. These registers had default, non-zero values, therefore if the VMSA did not match these values then a difference in launch measurement is observed. This change adds synchronization of these registers and adds them to the VMSA field check. Signed-off-by: Roy Hopkins <[email protected]>

Fix compile error due to no return statement in function returning non-void. Signed-off-by: Vijay Dhanraj <[email protected]>

Fix compile error when built without --enable-igvm

…_idblock Implement IGVM policy and ID block support

Always pair igvm_get_buffer() with igvm_free_buffer() before exiting the loop in supported_platform_compat_mask(). Signed-off-by: Peter Fang <[email protected]>

calmisi and others added 30 commits January 9, 2024 08:22

i386/cpuid: Decrease cpuid_i when skipping CPUID leaf 1F

eccfb38

Decrease array index cpuid_i when CPUID leaf 1F is skipped, otherwise it will get an all zero'ed CPUID entry with leaf 0 and subleaf 0. It conflicts with correct leaf 0. Signed-off-by: Xiaoyao Li <[email protected]>

i386/cpuid: Remove subleaf constraint on CPUID leaf 1F

83066e8

No such constraint that subleaf index needs to be less than 64. Signed-off-by: Xiaoyao Li <[email protected]>

trace/kvm: Split address space and slot id in trace_kvm_set_user_memo…

6f0194e

…ry() The upper 16 bits of kvm_userspace_memory_region::slot are address space id. Parse it separately in trace_kvm_set_user_memory(). Signed-off-by: Xiaoyao Li <[email protected]>

*** HACK *** linux-headers: Update headers to pull in gmem APIs

a4ef508

This patch needs to be updated by script scripts/update-linux-headers.sh once gmem fd support is upstreamed in Linux kernel. Signed-off-by: Xiaoyao Li <[email protected]>

physmem: replace function name with __func__ in ram_block_discard_ran…

132e2eb

…ge() Use __func__ to avoid hard-coded function name. Signed-off-by: Xiaoyao Li <[email protected]>

trace/kvm: Add trace for page convertion between shared and private

e807937

Signed-off-by: Isaku Yamahata <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]>

target/i386: Implement mc->kvm_type() to get VM type

89836cb

Implement mc->kvm_type() for i386 machines. It provides a way for user to create SW_PROTECTE_VM. Also store the vm_type in machinestate to other code to query what the VM type is. Signed-off-by: Xiaoyao Li <[email protected]>

memory: Introduce memory_region_init_ram_guest_memfd()

bbbd176

Introduce memory_region_init_ram_guest_memfd() to allocate private guset memfd on the MemoryRegion initialization. It's for the use case of TDVF, which must be private on TDX case. Signed-off-by: Xiaoyao Li <[email protected]>

Revert "target/i386: Parse TDX vm type"

48cd1d2

This reverts commit a1a6b66aa83d26bab1628b92b7720d5e4b672e7e.

*kvm: Use new KVM_MEM_GUEST_MEMFD flag instead of KVM_MEM_PRIVATE

0591737

TODO: squash into: kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot Signed-off-by: Michael Roth <[email protected]>

Update kernel headers for snp-host-v11

c2dee38

Signed-off-by: Michael Roth <[email protected]>

i386/sev: Add the SNP launch start context

065a76e

The SNP_LAUNCH_START is called first to create a cryptographic launch context within the firmware. Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Michael Roth <[email protected]>

i386/sev: Add support to encrypt BIOS when SEV-SNP is enabled

180c48d

The KVM_SEV_SNP_LAUNCH_UPDATE command is used for encrypting the bios image used for booting the SEV-SNP guest. Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Michael Roth <[email protected]>

mdroth and others added 28 commits January 9, 2024 08:22

*update headers for SNP hypervisor v11

86dfb3a

*drop use of KVM_GUEST_MEMFD_ALLOW_HUGEPAGE flag

3e81f8b

Test kernel now uses hugepages by default. Signed-off-by: Michael Roth <[email protected]>

*use new VMGEXIT handling for page-state changes

73a0ccc

Signed-off-by: Michael Roth <[email protected]>

*initial support for supplying certs for extended requests

8cbbc77

Signed-off-by: Michael Roth <[email protected]>

*fix non-SNP handling for memory regions

798d394

Signed-off-by: Michael Roth <[email protected]>

*skip RAMBlock notifiers for SNP

b6ee121

SEV uses these to pin pages, SNP doesn't need this since it relies on gmem for any private pages. Signed-off-by: Michael Roth <[email protected]>

backends/igvm: Fix compile error when built without --enable-igvm

72da3a9

Fix compile error due to no return statement in function returning non-void. Signed-off-by: Vijay Dhanraj <[email protected]>

Merge pull request coconut-svsm#9 from vijaydhanraj/svsm-igvm

026267e

Fix compile error when built without --enable-igvm

Merge pull request coconut-svsm#7 from roy-hopkins/snp-v3-wip-igvm_v2…

260df2b

…_idblock Implement IGVM policy and ID block support

backends/igvm: Fix memory leak

ef31de0

Always pair igvm_get_buffer() with igvm_free_buffer() before exiting the loop in supported_platform_compat_mask(). Signed-off-by: Peter Fang <[email protected]>

joergroedel requested a review from roy-hopkins June 12, 2024 18:54

joergroedel force-pushed the svsm-igvm branch from 260df2b to bcb7121 Compare December 17, 2024 11:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

backends/igvm: Fix memory leak #12

backends/igvm: Fix memory leak #12

peterfang commented May 7, 2024

backends/igvm: Fix memory leak #12

Are you sure you want to change the base?

backends/igvm: Fix memory leak #12

Conversation

peterfang commented May 7, 2024