riscv-non-isa · wojciechozga · Mar 26, 2024 · Mar 26, 2024 · Apr 2, 2024 · Apr 2, 2024
diff --git a/specification/appendix_d.adoc b/specification/appendix_d.adoc
@@ -0,0 +1,8 @@
+[[appendix_d]]
+== Appendix D: M-mode TSM based deployment model
+
+[id=dep3]
+[caption="Figure {counter:image}"]
+[title= ": M-mode TSM based deployment model for CoVE"]
+image::img_11.png[align=center]
+
diff --git a/specification/glossary.adoc b/specification/glossary.adoc
@@ -62,6 +62,8 @@ that hosts multiple mutually distrusting software owned by different tenants.
 
 | MTT | Memory Tracking Table (MTT).
 
+| Relying party | A relying party refers to a system providing access to a secured software application.
+
 | RISC-V Supervisor Domains | RISC-V privileged architecture <<R0>> defines
 the S-mode for execution of supervisor software. S-mode software may optionally
 enable the Hypervisor extension to host virtual machines. Typically, there is a
@@ -100,14 +102,15 @@ software, and firmware elements that are trusted by a relying party to
 protect the confidentiality and integrity of the relying parties' workload
 data and execution against a defined adversary model. In a system with
 separate processing elements within a package on a socket, the TCB
-boundary is the package. In a multi-socket system the TCB extends across
-the socket-to-socket interface, and is managed as one system TCB.
+boundary is the package. In a multi-socket system the hardware TCB extends across
+the socket-to-socket interface, and is managed as one system TCB. The software TCB may also extend
+across multiple sockets.
 
 | TEE | Trusted execution environment (TEE) is a set of hardware and software mechanisms that allow creating attestable and isolated execution environment.
 
 | TVM | TEE VM (TVM) also known as Confidential VM. It is a VM instantiation of an confidential workload.
 
-| Virtual Machine (VM) | Guest operating system hosted by a VMM.
+| VM | Virtual Machine (VM) is a guest operating system hosted by a VMM.
 
 | VMM | Virtual machine monitor (VMM) is used interchangeably with the term hypervisor in this document.
 

diff --git a/specification/images/img_11.png b/specification/images/img_11.png
diff --git a/specification/intro.adoc b/specification/intro.adoc
@@ -7,7 +7,7 @@ a scalable Trusted Execution Environment (TEE) for hardware virtual-machine-base
 workloads on RISC-V-based platforms. This CoVE interface specification enables
 application workloads that require confidentiality to reduce the Trusted
 Computing Base (TCB) to a minimal TCB, specifically, keeping the host OS/VMM,
-devices and other software outside the TCB.  Admitting devices into the TCB of CoVE
+devices and other software outside the TCB. Admitting devices into the TCB of CoVE
 TEE VMs is outside the scope of this specification and is described in the CoVE-IO
 specification.
 The proposed specification supports an

diff --git a/specification/overview.adoc b/specification/overview.adoc
@@ -18,10 +18,13 @@ of hardware-attested trusted execution environment called TEE Virtual Machines
 execution state and memory are run-time-isolated from the host OS/VMM and other
 platform software not in the TCB of the TVM. TVMs are protected from a broad
 set of software-based and hardware-based threats per the threat model described
-in <<threatmodel>>. The design describes an isolated (Confidential) Supervisor
+in <<threatmodel>>. The architecture describes an isolated (Confidential) Supervisor
 Domain to enforce TCB and confidentiality properties, while using an isolated
 (Hosting) Supervisor Domain for the host domain, thus maintaining the OS/VMMs
-role as the resource manager (for both legacy VMs and TVMs). The resources
+role as the resource manager (for both legacy VMs and TVMs). As described later
+in this document, the architecture can also be
+implemented on processors that do not support supervisor domains (Smmtt extension).
+When supervisor domains are available, the resources
 managed by the hosting supervisor domain (OS/VMM) include memory, CPU, I/O
 resources and platform capabilities to host the TVM workload. The terms
 hosting supervisor domain and OS/VMM are used interchangeably in this
@@ -40,10 +43,9 @@ Manager is called the " *TEE Security Manager* " or *(TSM)* - it acts as the
 trusted intermediary between TEE and non-TEE workloads on the same platform.
 The TSM should have a minimal hardware-attested footprint. The TCB (which includes
 the TSM and hardware) enforces strict confidentiality and integrity security
-properties for workloads in this supervisor domain. The Root Security Manager
-is an M-mode software module (called the " *TSM-driver* ") which isolates the
-Confidential Supervisor Domain from all other Supervisor domains and other
-platform components (non-confidential and
+properties for workloads in this supervisor domain. The Root Security Manager, 
+also called the " *TSM-driver* ", isolates the Confidential Supervisor Domain 
+from all other Supervisor domains and other platform components (non-confidential and
 confidential). The responsibility of the TSM is to enforce the security
 objectives accorded to TEE workloads assigned to that supervisor domain. The
 VMM is expected to continue to manage the security for non-confidential
@@ -87,10 +89,12 @@ TVMs may be hosted by the host OS/VMM via confidential supervisor domains.
 Each TVM may consist of the guest firmware, a guest OS and applications. The
 software components included in the TVM are implementation specific.
 
-As shown in <<dep1>>, the M-mode firmware TSM-driver is in the TCB of all
-Supervisor domains and hence in the TCB for all CoVE workloads hosted on the
-platform. The TSM-driver (operating in M-mode) uses
-the hardware capabilities to provide:
+<<dep1>> shows the deployment model in which the TSM-driver runs in M-mode and the TSM runs
+in HS-mode. Thus, the TSM-driver is in the TCB of all supervisor domains and hence in
+the TCB for all CoVE workloads hosted on the platform. Systems without the requirement
+for multiple confidential supervisor domains might combine the TSM-driver and the TSM and run them together
+in M-mode, see <<appendix_d>>. The TSM-driver (operating in M-mode) uses the hardware
+capabilities to provide:
 
 * Isolation of memory associated with TEEs (including the TSM). We describe
 *Confidential memory* as memory that is subject to access-control,
@@ -112,26 +116,26 @@ as *COVH* that includes functions to manage the lifecycle of the TVM, such as
 creating, adding pages to a TVM, scheduling a TVM for execution, etc., in an
 OS/platform agnostic manner. The TSM also provides an ABI to the TVM contexts:
 A set of guest ABIs known as *COVG* that enables the TVM workload to request
-attestation functions, memory management functions, or paravirtualized IO.
+attestation functions, memory management functions, or paravirtualized IO. 
 
 In order to isolate the TVMs from the host OS/VMM and non-confidential VMs,
-the supervisor domains (that contain the TSM state) must be isolated first -
-this is achieved by enforcing isolation for memory assigned to the supervisor
-domain that the TSM occupies - this is called the *TSM-memory-region.* The
-TSM-memory-region is expected to be a static region of memory that holds the TSM
-code and data. This region must be access-controlled from all software outside
-the TCB (e.g., using Smmtt), and may be additionally protected against physical
-access via cryptographic mechanisms.
+the supervisor domains (that contain the TSM state) must be isolated first.
+This is achieved by enforcing isolation for memory assigned to the supervisor
+domain that the TSM occupies. This memory region is called the *TSM-memory-region* and
+is expected to be a static region of memory that holds the TSM code and data.
+It must be access-controlled from all software outside the TCB (e.g., using Smmtt
+or PMP), and may be additionally protected against physical access with the help of
+cryptographic mechanisms.
 
 Access to the TSM-memory-region and execution of code from the
 TSM-memory-region (for the TSM ABIs) is enforced in hardware via the maintenance
 of the execution context (ASID, VMID and SDID) maintained per hart. This context
 is enabled per-hart via the TEECALL interface to context switch into the
 confidential supervisor domain context via the TSM-driver and disabled
 via the TEERET interface to context restore to the hosting supervisor domain.
-Access to TEE-assigned memory is allowed for the hart when the access is
-permitted as per the active permissions enforced by the memory management unit (MMU) for the supervisor
-domain active on the hart (enforced through Sv and Smmtt for CoVE). This
+Access to TEE-assigned memory is allowed for the hart when access is
+permitted as per the active permissions enforced by the memory management unit (MMU) 
+for the supervisor domain active on the hart (enforced through Sv and Smmtt for CoVE). This
 per-hart execution context is used by the processor to enforce access-control
 properties on memory accessed by TEE workloads managed by the TSM. The
 details of the supervisor domain access protection is specified in the Smmtt
@@ -143,12 +147,13 @@ the security of the TVMs through the resource management actions of the
 OS/VMM. These security primitives require the TSM to enforce TVM virtual-hart
 state save and restore, as well as enforcing invariants for memory assigned
 to the TVM, including G-stage translation. The host OS/VMM provides the
-typical VM resource management functionality for memory, IO, etc.
+typical VM resource management functionality for memory, IO, and VM lifecycle
+management.
 
-<<dep1>> shows Confidential VMs managed by a VMM and <<dep1a>> shows Confidential
-applications managed by an untrusted host OS. As evident from the architecture, the difference
-between these two scenarios is the software TCB (owned by the tenant within
-the TVM) for the tenant workload - in the application TEE case, a minimal
+<<dep1>> shows TVMs (a.k.a. confidential VMs) managed by a VMM and <<dep1a>> shows Confidential
+applications managed by an untrusted host OS. As evident from the architecture, the 
+difference between these two scenarios is the software TCB (owned by the tenant within
+the TVM) for the tenant workload: in the application TEE case, a minimal
 guest runtime may be used; whereas in the VM TEE case, an enlightened
 guest OS is expected in the TVM TCB. Other software models that map to the VU/VS
 modes of operation are also possible as TEE workloads. Importantly, the hardware
@@ -161,6 +166,6 @@ CoVE ABI.
 image::img_1.png[]
 
 The detailed architecture is described in the Section <<refarch>>. Note that the
-architecture described above may have various implementations, however the goal
+architecture described above may have various implementations. However, the goal
 of this specification is to propose a reference architecture and ratify a
-normative CoVE ABI for Confidential VMs as a RISC-V non-ISA specification.
+normative CoVE ABI for TVMs as a RISC-V non-ISA specification.
diff --git a/specification/refarch.adoc b/specification/refarch.adoc
@@ -9,43 +9,52 @@ the properties of the TSM, its instantiation, isolation and operational model
 for the TVM life cycle. The description in this section refers to the reference
 architecture in Figure 1.
 
-=== CoVE Memory Isolation
-
-Memory isolation for TVMs is orchestrated by the TSM-driver and the TSM in two
-phases: the conversion of memory to confidential memory and the assignment of
-confidential memory (alongwith the enforcement of properties on use) to TVMs.
-To enforce isolation across Host and Confidential supervisor domains, CoVE
-requires isolation of physical memory (that supports paging when enabled). There
-are two deployment models described below (1 and 2). CoVE ABI is applicable for both
-modes - this specification focuses on the first deployment model (1) where a
+=== CoVE Deployment Models
+There are three deployment models described below (1, 2, and 3). The CoVE ABI is applicable to
+all of them. This specification focuses mainly on the first deployment model (1) where a
 primary host supervisor domain is used to host confidential workloads in a
 secondary confidential domain.
 
 . The TSM operates in S/HS mode as a peer supervisor domain manager to the
 hosting supervisor domain which operates in S/HS mode as well. This model uses
-the Memory Tracking Table (MTT) along with G-stage page tables (PT) for confidential TVM isolation (where the 1st
+the Memory Tracking Table (MTT) along with G-stage page tables (PT) for TVM isolation (where the 1st
 stage PT is used by the Guest OS normally). The MTT is used to assign physical
 memory to the Confidential supervisor domain called *Confidential* memory and
 memory accessible to the hosting supervisor domain called *Non-Confidential*.
 MTT allows dynamic programming of the per-domain access permissions. This model
 is shown in <<dep1>>
 
 . The TSM is the only root HS mode component on the platform, hence, G-stage
-page tables (PT) can be used to enforce isolation between confidential TVMs and
+page tables (PT) can be used to enforce isolation between TVMs and
 ordinary VMs. In this model the host VMM must execute in the de-privileged VS
 mode and the TSM must provide nested virtualization of the H-extension controls.
 This model may be suitable for client/embedded systems and is shown in <<dep2>>.
 
-A TVM and/or TSM needs to access both types of memory:
+. The TSM runs alongside the TSM-driver in M-mode, allowing only for a single confidential
+supervisor domain. This model enforces isolation between TVMs using the MMU (G-stage PT) and
+between the hosting supervisor domain (OS/VMM, VMs) and the confidential supervisor domain using PMP.
+This model might be suitable for client/embedded systems and is shown in <<dep3>>.
 
+=== CoVE Memory Isolation
+Memory isolation for TVMs is orchestrated by the TSM-driver and the TSM in two
+phases: the conversion of memory to confidential memory and the assignment of
+confidential memory (along with the enforcement of properties on use) to TVMs.
+To enforce isolation across Host and Confidential supervisor domains, CoVE requires isolation of physical memory (that supports paging when enabled).
+
+CoVE distinguishes between two types of memory:
+
 * Confidential memory - used for TVM/TSM code and security-sensitive data;
 including state such as 1st-stage, G-stage page tables.
 * Non-confidential memory - used only for shared data, e.g., communication
 between the TVM/TSM and the non-TCB host software and/or non-TCB IO devices.
 
-The TSM COVH ABI provides interfaces to the OS/VMM to convert / donate
-memory from the hosting supervisor domain to the confidential supervisor domain.
-Similarly, a separate ABI intrinsic is used to reclaim memory back from the
+The split of memory into confidential and non-confidential might be static or dynamic.
+Static partitioning occurs during platform initialization, while dynamic partitioning occurs at runtime.
+The latter allows for better resource utilization at the cost of added complexity.
+
+To support dynamic memory partitioning, the TSM implements the COVH ABI, which enables the OS/VMM to convert / donate memory from the
+hosting supervisor domain to the confidential supervisor domain. Similarly, a separate ABI intrinsic
+is used to reclaim memory back from the
 confidential supervisor domain to the hosting supervisor domain. Once physical
 memory is converted to confidential - it is accessible only to the confidential
 supervisor domain. By default, TVM memory is assigned by the TSM (which
@@ -57,12 +66,12 @@ mitigate attacks from non-TCB software. The TSM enforces isolation between TVMs
 by using the G-stage page table.
 
 * Hart operating with the confidential supervisor domain context has MTT
-permissions to access Confidential and Non-confidential memory
+permissions to access Confidential and Non-confidential memory,
 * Hart not operating in a Confidential supervisor domain has access permissions
-only for Non-confidential memory
+only for Non-confidential memory.
 
-The RDSM configures the MTT such that a hart executing in the hosting domain
-does not have access to any confidential memory regions. The RDSM configures the
+The root domain security manager (RDSM) configures the MTT such that a hart executing 
+in the hosting domain does not have access to any confidential memory regions. The RDSM configures the
 MTT for the confidential domain to allow access to confidential memory
 exclusively to that domain, but may also allow access to non-confidential
 (shared) memory regions to one or more secondary domains.
@@ -80,16 +89,19 @@ unique memory encryption key. These additional protection aspects are platform
 and implementation dependent.
 ====
 
-Confidential and non-confidential memory are both always assigned by the VMM,
-i.e., the hosting supervisor domain - the TSM-driver is expected to manage the
+In deployment models that allow for dynamic memory partitioning,
+confidential and non-confidential memory are both always assigned by the VMM,
+i.e., the hosting supervisor domain. The TSM-driver is expected to manage the
 isolation for confidential memory assigned to any of the secondary supervisor
 domains by programming the Memory Tracking Table (MTT). The desired security
 properties of memory tracking are discussed below. The TSM (within a supervisor
 domain) manages page-based allocation using the G-stage page table from the set
 of confidential memory regions that are enforced by the MTT.
 
 Four aspects of memory isolation are impacted due to this dynamic configurable
-property of the MTT:
+property of the MTT and are discussed next: (1) address translation/page walk, (2) management of isolation 
+for confidential physical memory, (3) handling implicit & explicit memory accesses, and
+(4) cached translations/TLB management.
 
 ==== Address Translation/Page Walk
 Figure 2 describes a reference model for memory tracking lookup where
@@ -103,7 +115,7 @@ the paging sizes/modes supported by the hart.
 [title= "Memory Tracking for Supervisor Domains"]
 image::https://github.com/riscv/riscv-smmtt/blob/main/images/fig2.png?raw=true[]
 
-==== Management of isolation for Confidential Physical Memory
+==== Management of Isolation for Confidential Physical Memory
 
 The software TCB (specifically TSM) manages the assignment of physical memory to the Confidential
 supervisor domain, while the hardware TCB (specifically the hart MMU including virtual memory system,
@@ -282,7 +294,7 @@ concurrency controls on internal data structures and per-TVM global data
 structures (such as the G-stage page table structures).
 
 [caption="Figure {counter:image}: ", reftext="Figure {image}"]
-[title= "TSM operation - Interruptible and non-reentrant TSM model shown."]
+[title= "TSM operation: Interruptible and non-reentrant TSM model in deployment model 1."]
 image::img_3.png[]
 
 A TSM entry triggered by an ECALL (with CoVE extension type) by the OS/VMM
@@ -417,10 +429,12 @@ and the TSM-driver.
 There are two facets of TVM and TSM memory isolation that are
 implementation-specific:
 
-*a)* *Isolation from host software access* -  For the deployment model 2,
-the CPU must enforce hardware-based access-control of TSM memory via
-the G-stage page tables to prevent the guest VMM from accessing TSM
-memory. For the deployment model 1, the CPU must also similarly enforce
+*a)* *Isolation from host software access* - For deployment model 3,
+the CPU must enforce hardware-based access-control of TSM memory via a hardware
+memory isolation mechanism (e.g., PMP) configurable only by the TCB.
+Deployment model 2 enforces such access-control via the G-stage page tables,
+preventing the guest VMM from accessing TSM memory.
+For deployment model 1, the CPU must also similarly enforce
 access-control of TSM memory to prevent access from host supervisor
 domain components (VMM and host OS that operate in V=0, HS-mode) software.
 Since in this deployment model, other supervisor domains have access to 1st